Methods and systems for intra block copy coding with block vector derivation

ABSTRACT

Systems and methods are described for encoding and decoding video using derived block vectors as predictors in intra block copy mode. In an exemplary encoding method, an encoder identifies at least a first candidate block vector for the prediction of an input video block, where the first candidate block vector points to a first candidate block. The encoder then identifies a first predictive vector (e.g. a block vector or a motion vector) that was used to encode the first candidate block. From the first candidate block vector and the first predictive vector, the encoder generates a derived predictive vector. The encoder then encodes the input video block in the bit stream using the derived predictive vector for the prediction of the input video block.

CROSS-REFERENCE TO RELATED APPLICATIONS

The present application is a continuation of U.S. patent application Ser. No. 16/287,505, filed Feb. 27, 2019, which is a continuation of U.S. patent application Ser. No. 15/875,885, filed Jan. 19, 2018 (now U.S. Pat. No. 10,284,874), which is a continuation of U.S. patent application Ser. No. 14/743,657, filed Jun. 18, 2015 (now U.S. Pat. No. 9,877,043), which is a non-provisional filing of, and claims benefit under 35 U.S.C. § 119(e) from, U.S. Provisional Patent Application Ser. No. 62/014,664, filed Jun. 19, 2014. The contents of these applications are incorporated herein by reference in their entirety.

BACKGROUND

Over the past two decades, various digital video compression technologies have been developed and standardized to enable efficient digital video communication, distribution and consumption. Most of the commercially widely deployed standards are developed by ISO/IEC and ITU-T, such as H.261, MPEG-1, MPEG-2, H.263, MPEG-4 (part-2), and H.264/AVC (MPEG-4 part 10 Advanced Video Coding). Due to the emergence and maturity of new advanced video compression technologies, a new video coding standard, High Efficiency Video Coding (HEVC), was jointly developed by the ITU-T Video Coding Experts Group (VCEG) and ISO/IEC MPEG. HEVC (ITU-T H.265 / ISO/IEC 23008-2) was approved as an international standard in early 2013 and is able to achieve substantially higher coding efficiency than the current state-of-the-art H.264/AVC.

Screen content sharing applications have become more and more popular in recent years with the proliferation of remote desktop, video conferencing and mobile media presentation applications. A two-way screen content sharing system may include a host sub-system including a capturer, encoder and transmitter, and a client sub-system including a receiver, decoder and display (renderer). There are various application requirements from industries for screen content coding (SCC). As compared to natural video content, screen content often contains numerous blocks with several major colors and strong edges because of sharp curves and text that frequently appear in screen content. Although existing video compression methods can be used to encode screen content and then transmit that content to the receiver side, most existing methods do not accommodate the characteristics of screen content and thus lead to a low compression performance. The reconstruction of screen content using conventional video coding technologies often leads to serious quality issues. For example, the curves and texts are blurred and may be difficult to recognize. Therefore, a well-designed screen-content compression method is desirable for effectively reconstructing screen content.

SUMMARY

In some exemplary embodiments, a method is provided for generating a bit stream encoding a video that includes an input video block. An encoder identifies at least a first candidate block vector (BV) for prediction of the input video block, where the first candidate block vector points to a first candidate block. The encoder then identifies a first predictive vector (e.g. a block vector or a motion vector) that was used to encode the first candidate block. From the first candidate block vector and the first predictive vector, the encoder generates a derived predictive vector (e.g. a derived block vector or a derived motion vector). The encoder then encodes the input video block in the bit stream using the derived predictive vector for prediction of the input video block.

In some embodiments, the encoder signals the derived predictive vector in the bit stream. In some embodiments, the encoder signals the first predictive vector in the bit stream and also signals a flag in the bit stream indicating that the input video block is encoded using the derived predictive vector that is derived from the first predictive vector.

In some embodiments, the encoder signals in the bit stream an index identifying the derived predictive vector in a merge candidate list.

The derived predictive vector may be generated by adding the first candidate block vector and the first predictive vector. In such embodiments, where the first predictive vector is a second block vector, the derived predictive vector may be a block vector generated by adding the first candidate block vector and the second block vector (the first predictive vector). If the first predictive vector is a motion vector, the derived predictive vector may be a motion vector generated by adding the first candidate block vector and the first motion vector according to the equation

MVd=BV0+((MV1+2)>>2),

where BV0 is the first candidate block vector, MV1 is the first motion vector, and MVd is the derived motion vector.
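The two derivations described above can be illustrated with a short Python sketch. It is not a definitive implementation: the function names and the tuple representation of vectors are hypothetical, and the sketch assumes, as the rounding offset and right shift by two in the equation suggest, that MV1 is expressed in quarter-pixel units while block vectors are in integer-pixel units.

```python
from typing import Tuple

Vector = Tuple[int, int]  # (horizontal, vertical) components


def derive_block_vector(bv0: Vector, bv1: Vector) -> Vector:
    """Derived block vector: component-wise sum of BV0 and the second block vector."""
    return (bv0[0] + bv1[0], bv0[1] + bv1[1])


def derive_motion_vector(bv0: Vector, mv1: Vector) -> Vector:
    """Derived motion vector MVd = BV0 + ((MV1 + 2) >> 2), applied per component.

    Assumption: mv1 is in quarter-pixel units, so (mv1 + 2) >> 2 rounds it
    to integer-pixel units before it is added to the block vector.
    """
    return (bv0[0] + ((mv1[0] + 2) >> 2), bv0[1] + ((mv1[1] + 2) >> 2))
```

In Python, `>>` on a negative integer is an arithmetic shift, so negative components are rounded in the same way as positive ones.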

In some exemplary embodiments, derived predictive vectors (block vectors or motion vectors) are used as merge candidates. In an exemplary method, an encoder identifies at least a first block vector merge candidate for encoding of the input video block, and the encoder identifies a first predictive vector that was used to encode the first candidate block, i.e., the block to which the first block vector merge candidate points. The encoder then generates a derived predictive vector (a derived block vector or derived motion vector) from the first block vector merge candidate and the first predictive vector. The derived predictive vector is inserted in a merge candidate list. From the merge candidate list, the encoder chooses a selected predictive vector for the prediction of the input video block. The encoder then encodes the input video block in the bit stream using the selected predictive vector for the prediction of the input video block. The selected predictive vector may be the derived predictive vector.

In some such embodiments, the encoder determines whether the merge candidate list is full before generating and inserting the derived predictive vector. The steps of generating and inserting the derived predictive vector in the merge candidate list are performed only after a determination is made that the merge candidate list is not full.
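The fullness check can be sketched as follows. This is only an illustration under stated assumptions: `add_derived_candidate`, `max_candidates`, and `lookup_vector` are hypothetical names, the sketch covers only the case in which the first predictive vector is itself a block vector, and `lookup_vector` stands in for whatever mechanism returns the vector used to encode the block that a candidate points to.

```python
from typing import Callable, List, Optional, Tuple

Vector = Tuple[int, int]  # (horizontal, vertical) components


def add_derived_candidate(
    merge_list: List[Vector],
    max_candidates: int,
    bv_candidate: Vector,
    lookup_vector: Callable[[Vector], Optional[Vector]],
) -> bool:
    """Append a derived block vector to merge_list only if the list is not full.

    Returns True if a derived candidate was inserted, False otherwise.
    """
    # Check fullness first, so no derivation work is done for a full list.
    if len(merge_list) >= max_candidates:
        return False
    # Hypothetical lookup: vector used to encode the block bv_candidate points to.
    predictive = lookup_vector(bv_candidate)
    if predictive is None:
        return False
    derived = (bv_candidate[0] + predictive[0], bv_candidate[1] + predictive[1])
    merge_list.append(derived)
    return True
```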

In some such embodiments, the encoder identifies the first candidate block vector by conducting a search of previously-encoded video blocks.

In an exemplary method of decoding a coded video block from a bit stream, a decoder identifies at least a first candidate block vector for the prediction of the input video block, wherein the first candidate block vector points to a first candidate block. The decoder identifies a first predictive vector used to encode the first candidate block. The decoder then generates a derived predictive vector from the first candidate block vector and the first predictive vector and decodes the coded video block using the derived predictive vector for the prediction of the coded video block.

In such embodiments, the first candidate block vector may be identified using various different techniques. In one such method, the first candidate block vector is signaled in the bit stream, and identification of the first candidate block vector includes receiving the first candidate block vector signaled in the bit stream. In such a method, the generation of the derived predictive vector may be performed in response to receiving a flag in the bit stream indicating that the input video block is encoded with a derived predictive vector. In another such method, the identification of a first candidate block vector includes identification of a first block vector merge candidate. In such an embodiment, the derived predictive vector may also be a merge candidate. The decoder may use the derived predictive vector to decode the coded video block in response to receiving an index in the bit stream identifying the derived predictive vector merge candidate.

BRIEF DESCRIPTION OF THE DRAWINGS

A more detailed understanding may be had from the following description, presented by way of example in conjunction with the accompanying drawings, which are first briefly described below.

FIG. 1 is a block diagram illustrating an example of a block-based video encoder.

FIG. 2 is a block diagram illustrating an example of a block-based video decoder.

FIG. 3 is a diagram of an example of eight directional prediction modes.

FIG. 4 is a diagram illustrating an example of 33 directional prediction modes and two non-directional prediction modes.

FIG. 5 is a diagram of an example of horizontal prediction.

FIG. 6 is a diagram of an example of the planar mode.

FIG. 7 is a diagram illustrating an example of motion prediction.

FIG. 8 is a diagram illustrating an example of block-level movement within a picture.

FIG. 9 is a diagram illustrating an example of a coded bitstream structure.

FIG. 10 is a diagram illustrating an example communication system.

FIG. 11 is a diagram illustrating an example wireless transmit/receive unit (WTRU).

FIG. 12 is a diagram illustrating an example screen-content-sharing system.

FIG. 13 is a diagram illustrating an example of full frame intra block copy mode.

FIG. 14 is a diagram illustrating an example of local region intra block copy mode.

FIG. 15 is a diagram illustrating two examples of spatial candidates for intra block copy merge.

FIG. 16 is a diagram illustrating an example block vector derivation.

FIG. 17 is a diagram illustrating an example motion vector derivation.

FIGS. 18A and 18B together are a flowchart of an example method.

DETAILED DESCRIPTION

A detailed description of illustrative embodiments will now be provided with reference to the various Figures. Although this description provides detailed examples of possible implementations, it should be noted that the provided details are intended to be by way of example and in no way limit the scope of the application.

FIG. 1 is a block diagram illustrating an example of a block-based video encoder, for example, a hybrid video encoding system. The video encoder 100 may receive an input video signal 102. The input video signal 102 may be processed block by block. A video block may be of any size. For example, the video block unit may include 16×16 pixels. A video block unit of 16×16 pixels may be referred to as a macroblock (MB). In High Efficiency Video Coding (HEVC), extended block sizes (e.g., which may be referred to as a coding tree unit (CTU) or a coding unit (CU), two terms which are equivalent for purposes of this disclosure) may be used to efficiently compress high-resolution (e.g., 1080p and beyond) video signals. In HEVC, a CU may be up to 64×64 pixels. A CU may be partitioned into prediction units (PUs), for which separate prediction methods may be applied.

For an input video block (e.g., an MB or a CU), spatial prediction 160 and/or temporal prediction 162 may be performed. Spatial prediction (e.g., “intra prediction”) may use pixels from already coded neighboring blocks in the same video picture/slice to predict the current video block. Spatial prediction may reduce spatial redundancy inherent in the video signal. Temporal prediction (e.g., “inter prediction” or “motion compensated prediction”) may use pixels from already coded video pictures (e.g., which may be referred to as “reference pictures”) to predict the current video block. Temporal prediction may reduce temporal redundancy inherent in the video signal. A temporal prediction signal for a video block may be signaled by one or more motion vectors, which may indicate the amount and/or the direction of motion between the current block and its prediction block in the reference picture. If multiple reference pictures are supported (e.g., as may be the case for H.264/AVC and/or HEVC), then for a video block, its reference picture index may be sent. The reference picture index may be used to identify from which reference picture in a reference picture store 164 the temporal prediction signal comes.

The mode decision block 180 in the encoder may select a prediction mode, for example, after spatial and/or temporal prediction. The prediction block may be subtracted from the current video block at 116. The prediction residual may be transformed 104 and/or quantized 106. The quantized residual coefficients may be inverse quantized 110 and/or inverse transformed 112 to form the reconstructed residual, which may be added back to the prediction block at 126 to form the reconstructed video block.
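The residual and reconstruction path described above can be sketched in Python as follows. This is a simplified illustration, not the encoder of FIG. 1: it assumes that the transform stage is skipped and that quantization is a uniform scalar quantizer with a hypothetical step size `qstep`; the function name and the list-of-rows block representation are likewise assumptions.

```python
from typing import List, Tuple

Block = List[List[int]]  # rows of integer sample values, indexed block[y][x]


def encode_and_reconstruct(
    current: Block, prediction: Block, qstep: int
) -> Tuple[Block, Block]:
    """Return (quantized levels, reconstructed block) for one video block."""
    # Residual: current block minus prediction block (the subtraction at 116).
    residual = [[c - p for c, p in zip(cur_row, pred_row)]
                for cur_row, pred_row in zip(current, prediction)]
    # Quantization (106); the transform (104) is skipped in this sketch.
    # Note: Python's round() rounds exact ties to the nearest even integer.
    levels = [[round(r / qstep) for r in row] for row in residual]
    # Inverse quantization (110) yields the reconstructed residual.
    recon_residual = [[lv * qstep for lv in row] for row in levels]
    # Add the reconstructed residual back to the prediction block (126).
    reconstructed = [[p + r for p, r in zip(pred_row, res_row)]
                     for pred_row, res_row in zip(prediction, recon_residual)]
    return levels, reconstructed
```

Because the encoder reconstructs the block from the quantized levels rather than from the original samples, its reference data matches what a decoder would obtain from the same levels.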

In-loop filtering (e.g., a deblocking filter, a sample adaptive offset, an adaptive loop filter, and/or the like) may be applied 166 to the reconstructed video block before it is put in the reference picture store 164 and/or used to code future video blocks. The video encoder 100 may output an output video bitstream 120. To form the output video bitstream 120, a coding mode (e.g., inter prediction mode or intra prediction mode), prediction mode information, motion information, and/or quantized residual coefficients may be sent to the entropy coding unit 108 to be compressed and/or packed to form the bitstream. The reference picture store 164 may be referred to as a decoded picture buffer (DPB).

FIG. 2 is a block diagram illustrating an example of a block-based video decoder. The video decoder 200 may receive a video bitstream 202. The video bitstream 202 may be unpacked and/or entropy decoded at entropy decoding unit 208. The coding mode and/or prediction information used to encode the video bitstream may be sent to the spatial prediction unit 260 (e.g., if intra coded) and/or the temporal prediction unit 262 (e.g., if inter coded) to form a prediction block. If inter coded, the prediction information may comprise prediction block sizes, one or more motion vectors (e.g., which may indicate direction and amount of motion), and/or one or more reference indices (e.g., which may indicate from which reference picture to obtain the prediction signal). Motion-compensated prediction may be applied by temporal prediction unit 262 to form a temporal prediction block.

The residual transform coefficients may be sent to an inverse quantization unit 210 and an inverse transform unit 212 to reconstruct the residual block. The prediction block and the residual block may be added together at 226. The reconstructed block may go through in-loop filtering 266 before it is stored in reference picture store 264. The reconstructed video in the reference picture store 264 may be used to drive a display device and/or used to predict future video blocks. The video decoder 200 may output a reconstructed video signal 220. The reference picture store 264 may also be referred to as a decoded picture buffer (DPB).

A video encoder and/or decoder (e.g., video encoder 100 or video decoder 200) may perform spatial prediction (e.g., which may be referred to as intra prediction). Spatial prediction may be performed by predicting from already coded neighboring pixels following one of a plurality of prediction directions (e.g., which may be referred to as directional intra prediction).

FIG. 3 is a diagram of an example of eight directional prediction modes. The eight directional prediction modes of FIG. 3 may be supported in H.264/AVC. As shown generally at 300 in FIG. 3, the nine modes (including DC mode 2) are:

-   Mode 0: Vertical prediction
-   Mode 1: Horizontal prediction
-   Mode 2: DC prediction
-   Mode 3: Diagonal down-left prediction
-   Mode 4: Diagonal down-right prediction
-   Mode 5: Vertical-right prediction
-   Mode 6: Horizontal-down prediction
-   Mode 7: Vertical-left prediction
-   Mode 8: Horizontal-up prediction

Spatial prediction may be performed on a video block of various sizes and/or shapes. Spatial prediction of a luma component of a video signal may be performed, for example, for block sizes of 4×4, 8×8, and 16×16 pixels (e.g., in H.264/AVC). Spatial prediction of a chroma component of a video signal may be performed, for example, for block size of 8×8 (e.g., in H.264/AVC). For a luma block of size 4×4 or 8×8, a total of nine prediction modes may be supported, for example, eight directional prediction modes and the DC mode (e.g., in H.264/AVC). For a luma block of size 16×16, four prediction modes may be supported, for example, horizontal, vertical, DC, and planar prediction.

Furthermore, directional intra prediction modes and non-directional prediction modes may be supported.

FIG. 4 is a diagram illustrating an example of 33 directional prediction modes and two non-directional prediction modes. The 33 directional prediction modes and two non-directional prediction modes, shown generally at 400 in FIG. 4, may be supported by HEVC. Spatial prediction using larger block sizes may be supported. For example, spatial prediction may be performed on a block of any size, for example, of square block sizes of 4×4, 8×8, 16×16, 32×32, or 64×64. Directional intra prediction (e.g., in HEVC) may be performed with 1/32-pixel precision.

Non-directional intra prediction modes may be supported (e.g., in H.264/AVC, HEVC, or the like), for example, in addition to directional intra prediction. Non-directional intra prediction modes may include the DC mode and/or the planar mode. For the DC mode, a prediction value may be obtained by averaging the available neighboring pixels and the prediction value may be applied to the entire block uniformly. For the planar mode, linear interpolation may be used to predict smooth regions with slow transitions. H.264/AVC may allow for use of the planar mode for 16×16 luma blocks and chroma blocks.
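The DC mode described above can be illustrated with a minimal Python sketch. The function name `dc_predict` and its parameters are hypothetical, and the rounded integer average is an assumption made for illustration; a particular standard may define the averaging and the handling of unavailable neighbors differently.

```python
from typing import List, Sequence


def dc_predict(top: Sequence[int], left: Sequence[int], size: int) -> List[List[int]]:
    """Fill a size x size block with the average of the available neighbors.

    top and left hold the reconstructed neighboring pixels above and to the
    left of the block; either may be empty if those neighbors are unavailable.
    """
    neighbors = list(top) + list(left)
    if not neighbors:
        raise ValueError("DC prediction needs at least one available neighbor")
    # Rounded integer average (assumption: round half up).
    dc = (sum(neighbors) + len(neighbors) // 2) // len(neighbors)
    # The single DC value is applied uniformly to the entire block.
    return [[dc] * size for _ in range(size)]
```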

An encoder (e.g., the encoder 100) may perform a mode decision (e.g., at block 180 in FIG. 1) to determine the best coding mode for a video block. When the encoder determines to apply intra prediction (e.g., instead of inter prediction), the encoder may determine an optimal intra prediction mode from the set of available modes. The selected directional intra prediction mode may offer strong hints as to the direction of any texture, edge, and/or structure in the input video block.

FIG. 5 is a diagram of an example of horizontal prediction (e.g., for a 4×4 block), as shown generally at 500 in FIG. 5. Already reconstructed pixels P0, P1, P2 and P3 (i.e., the shaded boxes) may be used to predict the pixels in the current 4×4 video block. In horizontal prediction, a reconstructed pixel, for example, pixels P0, P1, P2 and/or P3, may be propagated horizontally along the direction of a corresponding row to predict the 4×4 block. For example, prediction may be performed according to Equation (1) below, where L(x, y) may be the pixel to be predicted at (x, y), x,y=0 . . . 3.

L(x,0)=P0

L(x,1)=P1

L(x,2)=P2

L(x,3)=P3   (1)
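Equation (1) can be expressed as a short Python sketch. The function name `horizontal_predict` is hypothetical, and the block is represented as a list of rows indexed `block[y][x]`, which is an assumption made for illustration.

```python
from typing import List, Sequence


def horizontal_predict(left_pixels: Sequence[int]) -> List[List[int]]:
    """Horizontal intra prediction: L(x, y) = P_y for every x in row y.

    left_pixels holds the reconstructed pixels P0..P(n-1) immediately to the
    left of the block, one per row; the result is an n x n block.
    """
    n = len(left_pixels)
    # Each reconstructed left neighbor is propagated along its own row.
    return [[left_pixels[y]] * n for y in range(n)]
```

For a 4×4 block, passing the four reconstructed pixels P0 to P3 yields four rows, each filled with the value of its left neighbor.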

FIG. 6 is a diagram of an example of the planar mode, as shown generally at 600 in FIG. 6. The planar mode may be performed as follows: the rightmost pixel in the top row (marked by a T) may be replicated to predict pixels in the rightmost column. The bottom pixel in the left column (marked by an L) may be replicated to predict pixels in the bottom row. Bilinear interpolation in the horizontal direction (as shown in the left block) may be performed to produce a first prediction H(x,y) of center pixels. Bilinear interpolation in the vertical direction (e.g., as shown in the right block) may be performed to produce a second prediction V(x,y) of center pixels. An averaging between the horizontal prediction and the vertical prediction may be performed to obtain a final prediction L(x,y), using L(x,y)=((H(x,y)+V(x,y))>>1).

FIG. 7 and FIG. 8 are diagrams illustrating, as shown generally at 700 and 800, an example of motion prediction of video blocks (e.g., using temporal prediction unit 162 of FIG. 1). FIG. 8, which illustrates an example of block-level movement within a picture, is a diagram illustrating an example decoded picture buffer including, for example, reference pictures “Ref pic 0,” “Ref pic 1,” and “Ref pic 2.” The blocks B0, B1, and B2 in a current picture may be predicted from blocks in reference pictures “Ref pic 0,” “Ref pic 1,” and “Ref pic 2,” respectively. Motion prediction may use video blocks from neighboring video frames to predict the current video block. Motion prediction may exploit temporal correlation and/or remove temporal redundancy inherent in the video signal. For example, in H.264/AVC and HEVC, temporal prediction may be performed on video blocks of various sizes (e.g., for the luma component, temporal prediction block sizes may vary from 16×16 to 4×4 in H.264/AVC, and from 64×64 to 4×4 in HEVC). With a motion vector of (mvx, mvy), temporal prediction may be performed as provided by equation (2):

P(x,y)=ref(x−mvx,y−mvy)   (2)

where ref(x,y) may be the pixel value at location (x, y) in the reference picture, and P(x,y) may be the predicted block. A video coding system may support inter-prediction with fractional pixel precision. When a motion vector (mvx, mvy) has a fractional pixel value, one or more interpolation filters may be applied to obtain the pixel values at fractional pixel positions. Block-based video coding systems may use multi-hypothesis prediction to improve temporal prediction, for example, where a prediction signal may be formed by combining a number of prediction signals from different reference pictures. For example, H.264/AVC and/or HEVC may use bi-prediction that may combine two prediction signals. Bi-prediction may combine two prediction signals, each from a reference picture, to form a prediction, such as in the following equation (3):

$\begin{matrix}{{P( {x,y} )} = {\frac{{P_{0}( {x,y} )} + {P_{1}( {x,y} )}}{2} = \frac{{{ref}_{0}( {{x - {mvx_{0}}},{y - {mvy_{0}}}} )} + {{ref}_{1}( {{x - {mvx_{1}}},{y - {mvy_{1}}}} )}}{2}}} & (3)\end{matrix}$

where P₀(x,y) and P₁(x,y) may be the first and the second prediction block, respectively. As illustrated in equation (3), the two prediction blocks may be obtained by performing motion-compensated prediction from two reference pictures ref₀(x,y) and ref₁(x, y), with two motion vectors (mvx₀, mvy₀) and (mvx₁, mvy₁), respectively. The prediction block P(x,y) may be subtracted from the source video block (e.g., at 116) to form a prediction residual block. The prediction residual block may be transformed (e.g., at transform unit 104) and/or quantized (e.g., at quantization unit 106). The quantized residual transform coefficient blocks may be sent to an entropy coding unit (e.g., entropy coding unit 108) to be entropy coded to reduce bit rate. The entropy coded residual coefficients may be packed to form part of an output video bitstream (e.g., bitstream 120).
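Equations (2) and (3) can be illustrated with the following Python sketch. It is an illustration under stated assumptions rather than a codec implementation: the function names are hypothetical, pictures are lists of rows indexed `ref[y][x]`, only integer-pixel motion vectors are handled (no interpolation filters), and the average in equation (3) is computed with ordinary division, whereas a practical codec would typically round the result to an integer sample value.

```python
from typing import List, Tuple

Picture = List[List[int]]  # rows of sample values, indexed picture[y][x]


def predict_block(ref: Picture, x0: int, y0: int, width: int, height: int,
                  mv: Tuple[int, int]) -> Picture:
    """Equation (2): P(x, y) = ref(x - mvx, y - mvy) for an integer-pel MV.

    (x0, y0) is the top-left corner of the current block in picture coordinates.
    """
    mvx, mvy = mv
    block = []
    for y in range(y0, y0 + height):
        row = []
        for x in range(x0, x0 + width):
            sx, sy = x - mvx, y - mvy
            # Guard explicitly: Python would silently wrap negative indices.
            if not (0 <= sy < len(ref) and 0 <= sx < len(ref[0])):
                raise ValueError("motion vector points outside the reference picture")
            row.append(ref[sy][sx])
        block.append(row)
    return block


def bi_predict(p0: Picture, p1: Picture) -> List[List[float]]:
    """Equation (3): average of two prediction blocks, sample by sample."""
    return [[(a + b) / 2 for a, b in zip(r0, r1)] for r0, r1 in zip(p0, p1)]
```

The explicit bounds check reflects a design choice for this sketch only; real codecs usually pad the reference picture so that vectors pointing slightly outside it remain valid.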

A single layer video encoder may take a single video sequence input and generate a single compressed bit stream transmitted to the single layer decoder. A video codec may be designed for digital video services (e.g., such as but not limited to sending TV signals over satellite, cable and terrestrial transmission channels). With video centric applications deployed in heterogeneous environments, multi-layer video coding technologies may be developed as an extension of the video coding standards to enable various applications. For example, multiple layer video coding technologies, such as scalable video coding and/or multi-view video coding, may be designed to handle more than one video layer where each layer may be decoded to reconstruct a video signal of a particular spatial resolution, temporal resolution, fidelity, and/or view. Although a single layer encoder and decoder are described with reference to FIG. 1 and FIG. 2, the concepts described herein may utilize a multiple layer encoder and/or decoder, for example, for multi-view and/or scalable coding technologies.

Scalable video coding may improve the quality of experience for video applications running on devices with different capabilities over heterogeneous networks. Scalable video coding may encode the signal once at a highest representation (e.g., temporal resolution, spatial resolution, quality, etc.), but enable decoding from subsets of the video streams depending on the specific rate and representation required by certain applications running on a client device. Scalable video coding may save bandwidth and/or storage compared to non-scalable solutions. The international video standards, e.g., MPEG-2 Video, H.263, MPEG-4 Visual, H.264, etc., may have tools and/or profiles that support modes of scalability.

Table 1 provides an example of different types of scalabilities along with the corresponding standards that may support them. Bit-depth scalability and/or chroma format scalability may be tied to video formats (e.g., higher than 8-bit video, and chroma sampling formats higher than YUV4:2:0), for example, which may primarily be used by professional video applications. Aspect ratio scalability may be provided.

TABLE 1

  Scalability                  Example                        Standards
  ---------------------------  -----------------------------  ------------------------------
  View scalability             2D→3D (2 or more views)        MVC, MFC, 3DV
  Spatial scalability          720p→4080p                     SVC, scalable HEVC
  Quality (SNR) scalability    35 dB→38 dB                    SVC, scalable HEVC
  Temporal scalability         30 fps→60 fps                  H.264/AVC, SVC, scalable HEVC
  Standards scalability        H.264/AVC→HEVC                 3DV, scalable HEVC
  Bit-depth scalability        8-bit video→10-bit video       Scalable HEVC
  Chroma format scalability    YUV4:2:0→YUV4:2:2, YUV4:4:4    Scalable HEVC
  Color gamut scalability      BT.709→BT.2020                 Scalable HEVC
  Aspect ratio scalability     4:3→16:9                       Scalable HEVC

Scalable video coding may provide a first level of video quality associated with a first set of video parameters using the base layer bitstream. Scalable video coding may provide one or more levels of higher quality associated with one or more sets of enhanced parameters using one or more enhancement layer bitstreams. The set of video parameters may include one or more of spatial resolution, frame rate, reconstructed video quality (e.g., in the form of SNR, PSNR, VQM, visual quality, etc.), 3D capability (e.g., with two or more views), luma and chroma bit depth, chroma format, and underlying single-layer coding standard. Different use cases may use different types of scalability, for example, as illustrated in Table 1. A scalable coding architecture may offer a common structure that may be configured to support one or more scalabilities (e.g., the scalabilities listed in Table 1). A scalable coding architecture may be flexible to support different scalabilities with minimum configuration efforts. A scalable coding architecture may include at least one preferred operating mode that may not require changes to block level operations, such that the coding logics (e.g., encoding and/or decoding logics) may be maximally reused within the scalable coding system. For example, a scalable coding architecture based on a picture level inter-layer processing and management unit may be provided, wherein the inter-layer prediction may be performed at the picture level.

FIG. 9 is a diagram illustrating an example of a coded bitstream structure. A coded bitstream 1000 consists of a number of NAL (Network Abstraction Layer) units 1001. A NAL unit may contain coded sample data such as coded slice 1006, or high level syntax metadata such as parameter set data, slice header data 1005 or supplemental enhancement information data 1007 (which may be referred to as an SEI message). Parameter sets are high level syntax structures containing essential syntax elements that may apply to multiple bitstream layers (e.g. video parameter set 1002 (VPS)), or may apply to a coded video sequence within one layer (e.g. sequence parameter set 1003 (SPS)), or may apply to a number of coded pictures within one coded video sequence (e.g. picture parameter set 1004 (PPS)). The parameter sets can be either sent together with the coded pictures of the video bit stream, or sent through other means (including out-of-band transmission using reliable channels, hard coding, etc.). Slice header 1005 is also a high level syntax structure that may contain some picture-related information that is relatively small or relevant only for certain slice or picture types. SEI messages 1007 carry the information that may not be needed by the decoding process but can be used for various other purposes such as picture output timing or display as well as loss detection and concealment.

FIG. 10 is a diagram illustrating an example of a communication system. The communication system 1300 may comprise an encoder 1302, a communication network 1304, and a decoder 1306. The encoder 1302 may be in communication with the network 1304 via a connection 1308, which may be a wireline connection or a wireless connection. The encoder 1302 may be similar to the block-based video encoder of FIG. 1. The encoder 1302 may include a single layer codec (e.g., FIG. 1) or a multilayer codec. For example, the encoder 1302 may be a multi-layer (e.g., two-layer) scalable coding system with picture-level inter-layer prediction (ILP) support. The decoder 1306 may be in communication with the network 1304 via a connection 1310, which may be a wireline connection or a wireless connection. The decoder 1306 may be similar to the block-based video decoder of FIG. 2. The decoder 1306 may include a single layer codec (e.g., FIG. 2) or a multilayer codec. For example, the decoder 1306 may be a multi-layer (e.g., two-layer) scalable decoding system with picture-level ILP support.

The encoder 1302 and/or the decoder 1306 may be incorporated into a wide variety of wired communication devices and/or wireless transmit/receive units (WTRUs), such as, but not limited to, digital televisions, wireless broadcast systems, a network element/terminal, servers, such as content or web servers (e.g., such as a Hypertext Transfer Protocol (HTTP) server), personal digital assistants (PDAs), laptop or desktop computers, tablet computers, digital cameras, digital recording devices, video gaming devices, video game consoles, cellular or satellite radio telephones, digital media players, and/or the like.

The communications network 1304 may be a suitable type of communication network. For example, the communications network 1304 may be a multiple access system that provides content, such as voice, data, video, messaging, broadcast, etc., to multiple wireless users. The communications network 1304 may enable multiple wireless users to access such content through the sharing of system resources, including wireless bandwidth. For example, the communications network 1304 may employ one or more channel access methods, such as code division multiple access (CDMA), time division multiple access (TDMA), frequency division multiple access (FDMA), orthogonal FDMA (OFDMA), single-carrier FDMA (SC-FDMA), and/or the like. The communication network 1304 may include multiple connected communication networks. The communication network 1304 may include the Internet and/or one or more private commercial networks such as cellular networks, WiFi hotspots, Internet Service Provider (ISP) networks, and/or the like.

FIG. 11 is a system diagram of an example WTRU. As shown the exampleWTRU 1202 may include a processor 1218, a transceiver 1220, atransmit/receive element 1222, a speaker/microphone 1224, a keypad orkeyboard 1226, a display/touchpad 1228, non-removable memory 1230,removable memory 1232, a power source 1234, a global positioning system(GPS) chipset 1236, and/or other peripherals 1238. It will beappreciated that the WTRU 1202 may include any sub-combination of theforegoing elements while remaining consistent with an embodiment.Further, a terminal in which an encoder (e.g., encoder 100) and/or adecoder (e.g., decoder 200) is incorporated may include some or all ofthe elements depicted in and described herein with reference to the WTRU1202 of FIG. 11.

The processor 1218 may be a general purpose processor, a special purpose processor, a conventional processor, a digital signal processor (DSP), a graphics processing unit (GPU), a plurality of microprocessors, one or more microprocessors in association with a DSP core, a controller, a microcontroller, Application Specific Integrated Circuits (ASICs), Field Programmable Gate Array (FPGA) circuits, any other type of integrated circuit (IC), a state machine, and the like. The processor 1218 may perform signal coding, data processing, power control, input/output processing, and/or any other functionality that enables the WTRU 1202 to operate in a wired and/or wireless environment. The processor 1218 may be coupled to the transceiver 1220, which may be coupled to the transmit/receive element 1222. While FIG. 11 depicts the processor 1218 and the transceiver 1220 as separate components, it will be appreciated that the processor 1218 and the transceiver 1220 may be integrated together in an electronic package and/or chip.

The transmit/receive element 1222 may be configured to transmit signals to, and/or receive signals from, another terminal over an air interface 1215. For example, in one or more embodiments, the transmit/receive element 1222 may be an antenna configured to transmit and/or receive RF signals. In one or more embodiments, the transmit/receive element 1222 may be an emitter/detector configured to transmit and/or receive IR, UV, or visible light signals, for example. In one or more embodiments, the transmit/receive element 1222 may be configured to transmit and/or receive both RF and light signals. It will be appreciated that the transmit/receive element 1222 may be configured to transmit and/or receive any combination of wireless signals.

In addition, although the transmit/receive element 1222 is depicted in FIG. 11 as a single element, the WTRU 1202 may include any number of transmit/receive elements 1222. More specifically, the WTRU 1202 may employ MIMO technology. Thus, in one embodiment, the WTRU 1202 may include two or more transmit/receive elements 1222 (e.g., multiple antennas) for transmitting and receiving wireless signals over the air interface 1215.

The transceiver 1220 may be configured to modulate the signals that are to be transmitted by the transmit/receive element 1222 and/or to demodulate the signals that are received by the transmit/receive element 1222. As noted above, the WTRU 1202 may have multi-mode capabilities. Thus, the transceiver 1220 may include multiple transceivers for enabling the WTRU 1202 to communicate via multiple RATs, such as UTRA and IEEE 802.11, for example.

The processor 1218 of the WTRU 1202 may be coupled to, and may receive user input data from, the speaker/microphone 1224, the keypad 1226, and/or the display/touchpad 1228 (e.g., a liquid crystal display (LCD) display unit or organic light-emitting diode (OLED) display unit). The processor 1218 may also output user data to the speaker/microphone 1224, the keypad 1226, and/or the display/touchpad 1228. In addition, the processor 1218 may access information from, and store data in, any type of suitable memory, such as the non-removable memory 1230 and/or the removable memory 1232. The non-removable memory 1230 may include random-access memory (RAM), read-only memory (ROM), a hard disk, or any other type of memory storage device. The removable memory 1232 may include a subscriber identity module (SIM) card, a memory stick, a secure digital (SD) memory card, and the like. In one or more embodiments, the processor 1218 may access information from, and store data in, memory that is not physically located on the WTRU 1202, such as on a server or a home computer (not shown).

The processor 1218 may receive power from the power source 1234, and may be configured to distribute and/or control the power to the other components in the WTRU 1202. The power source 1234 may be any suitable device for powering the WTRU 1202. For example, the power source 1234 may include one or more dry cell batteries (e.g., nickel-cadmium (NiCd), nickel-zinc (NiZn), nickel metal hydride (NiMH), lithium-ion (Li-ion), etc.), solar cells, fuel cells, and the like.

The processor 1218 may be coupled to the GPS chipset 1236, which may be configured to provide location information (e.g., longitude and latitude) regarding the current location of the WTRU 1202. In addition to, or in lieu of, the information from the GPS chipset 1236, the WTRU 1202 may receive location information over the air interface 1215 from a terminal (e.g., a base station) and/or determine its location based on the timing of the signals being received from two or more nearby base stations. It will be appreciated that the WTRU 1202 may acquire location information by way of any suitable location-determination method while remaining consistent with an embodiment.

The processor 1218 may further be coupled to other peripherals 1238, which may include one or more software and/or hardware modules that provide additional features, functionality and/or wired or wireless connectivity. For example, the peripherals 1238 may include an accelerometer, orientation sensors, motion sensors, a proximity sensor, an e-compass, a satellite transceiver, a digital camera and/or video recorder (e.g., for photographs and/or video), a universal serial bus (USB) port, a vibration device, a television transceiver, a hands free headset, a Bluetooth® module, a frequency modulated (FM) radio unit, and software modules such as a digital music player, a media player, a video game player module, an Internet browser, and the like.

By way of example, the WTRU 1202 may be configured to transmit and/or receive wireless signals and may include user equipment (UE), a mobile station, a fixed or mobile subscriber unit, a pager, a cellular telephone, a personal digital assistant (PDA), a smartphone, a laptop, a netbook, a tablet computer, a personal computer, a wireless sensor, consumer electronics, or any other terminal capable of receiving and processing compressed video communications.

The WTRU 1202 and/or a communication network (e.g., communication network 804) may implement a radio technology such as Universal Mobile Telecommunications System (UMTS) Terrestrial Radio Access (UTRA), which may establish the air interface 1215 using wideband CDMA (WCDMA). WCDMA may include communication protocols such as High-Speed Packet Access (HSPA) and/or Evolved HSPA (HSPA+). HSPA may include High-Speed Downlink Packet Access (HSDPA) and/or High-Speed Uplink Packet Access (HSUPA). The WTRU 1202 and/or a communication network (e.g., communication network 804) may implement a radio technology such as Evolved UMTS Terrestrial Radio Access (E-UTRA), which may establish the air interface 1215 using Long Term Evolution (LTE) and/or LTE-Advanced (LTE-A).

The WTRU 1202 and/or a communication network (e.g., communication network 804) may implement radio technologies such as IEEE 802.16 (e.g., Worldwide Interoperability for Microwave Access (WiMAX)), CDMA2000, CDMA2000 1X, CDMA2000 EV-DO, Interim Standard 2000 (IS-2000), Interim Standard 95 (IS-95), Interim Standard 856 (IS-856), Global System for Mobile communications (GSM), Enhanced Data rates for GSM Evolution (EDGE), GSM EDGE (GERAN), and the like. The WTRU 1202 and/or a communication network (e.g., communication network 804) may implement a radio technology such as IEEE 802.11, IEEE 802.15, or the like.

FIG. 12 is a diagram illustrating an example two-way screen-content-sharing system 1600. The diagram illustrates a host sub-system including capturer 1602, encoder 1604, and transmitter 1606. FIG. 12 further illustrates a client sub-system including receiver 1608 (which outputs a received input bitstream 1610), decoder 1612, and display (renderer) 1618. The decoder 1612 outputs to display picture buffers 1614, which in turn transmit decoded pictures 1616 to the display 1618. There are application requirements from industries for screen content coding (SCC). See [R12], [R13]. Screen content compression methods are becoming important for some specific applications because more and more people share their device content for media presentation or remote desktop purposes. The screen displays of mobile devices have greatly improved to support high definition or ultra-high definition resolutions. Traditional video coding methods increase the bandwidth requirement for transmitting screen content in screen-sharing applications.

As discussed above, FIG. 2 is a block diagram of a generic block-based single layer decoder that receives a video bitstream produced by an encoder such as the encoder in FIG. 1, and reconstructs the video signal to be displayed. As also discussed above, at the video decoder, the bitstream is first parsed by the entropy decoder. The residual coefficients are inverse quantized and inverse transformed to obtain the reconstructed residual. The coding mode and prediction information are used to obtain the prediction signal using either spatial prediction or temporal prediction. The prediction signal and the reconstructed residual are added together to get the reconstructed video. The reconstructed video may additionally go through loop filtering before being stored in the reference picture store to be displayed and/or to be used to decode future video signals. As shown in FIG. 1, to achieve efficient compression, a single layer encoder employs widely known techniques such as spatial prediction (also referred to as intra prediction) and temporal prediction (also referred to as inter prediction and/or motion compensated prediction) to predict the input video signal. The encoder also has mode decision logic that chooses the most suitable form of prediction, usually based on certain criteria such as a combination of rate and distortion considerations. See [R11]. The encoder then transforms and quantizes the prediction residual (the difference signal between the input signal and the prediction signal). The quantized residual, together with the mode information (e.g., intra or inter prediction) and prediction information (motion vectors, reference picture indexes, intra prediction modes, etc.) are further compressed at the entropy coder and packed into the output video bitstream. As shown in FIG. 1, the encoder also generates the reconstructed video signal by applying inverse quantization and inverse transform to the quantized residual to obtain a reconstructed residual, and adding it back to the prediction signal. The reconstructed video signal may additionally go through a loop filter process (for example, deblocking filter, Sample Adaptive Offsets, or Adaptive Loop Filters), and is finally stored in the reference picture store to be used to predict future video signals.

In order to save transmission bandwidth and storage, MPEG has been working on video coding standards for many years. High Efficiency Video Coding (HEVC) (see [R13]) is an emerging video compression standard. HEVC is currently being jointly developed by ITU-T Video Coding Experts Group (VCEG) and ISO/IEC Moving Picture Experts Group (MPEG). It can save 50% bandwidth compared to H.264 at the same quality. HEVC is still a block-based hybrid video coding standard, in that its encoder and decoder generally operate according to the manner discussed above in connection with FIG. 1 and FIG. 2. HEVC allows the use of larger video blocks, and it uses quadtree partition to signal block coding information. The picture or slice is first partitioned into coding tree blocks (CTB) having the same size (e.g., 64×64). Each CTB is partitioned into CUs with quadtree, and each CU is partitioned further into prediction units (PU) and transform units (TU), also using quadtree. For each inter coded CU, its PU can be one of 8 partition modes, as shown and discussed above in connection with FIG. 3. Temporal prediction, also called motion compensation, is applied to reconstruct all inter coded PUs. Depending on the precision of the motion vectors (which can be up to quarter pixel in HEVC), linear filters are applied to obtain pixel values at fractional positions. In HEVC, the interpolation filters have 7 or 8 taps for luma and 4 taps for chroma. The deblocking filter in HEVC is content based; different deblocking filter operations are applied at the TU and PU boundaries, depending on a number of factors, such as coding mode difference, motion difference, reference picture difference, pixel value difference, and so on. For entropy coding, HEVC adopts context-based adaptive binary arithmetic coding (CABAC) for most block level syntax elements except high level parameters. There are two kinds of bins in CABAC coding: one is context-based coded regular bins, and the other is by-pass coded bins without context.

Although the current HEVC design contains various block coding modes, it does not fully utilize the spatial redundancy for screen content coding. This is because HEVC is focused on continuous tone video content in 4:2:0 format, and the mode decision and transform coding tools are not optimized for the discrete tone screen content, which is often captured in the format of 4:4:4 video. As the HEVC standard began to mature and stabilize in late 2012, the standardization bodies VCEG and MPEG started to work on the future extension of HEVC for screen content coding. In January 2014, the Call for Proposals (CFP) of screen content coding was jointly issued by ITU-T VCEG and ISO/IEC MPEG. The CFP received a fair amount of attention and resulted in seven responses [R2]-[R8] from various different companies proposing various efficient SCC solutions. Given that screen content material such as text and graphics show different characteristics compared to natural video content, some novel coding tools that improve the coding efficiency of screen content coding were proposed, for example, 1D string copy [R9], palette coding [R10], [R11] and intra block copy (IntraBC) [R12], [R17]. All those screen content coding related tools were investigated in core experiments [R18]-[R22]. Screen content has highly repetitive patterns in terms of line segments or blocks and many small homogeneous regions (e.g. mono-color regions). Usually only a few colors exist within a small block. In contrast, there are many colors even in a small block for natural video. The color value at each position is usually repeated from its above or horizontal neighboring pixel. 1D string copy involves predicting the string with variable length from previous reconstructed pixel buffers. The position and string length will be signaled. In palette coding mode, instead of directly coding the pixel value, a palette table is used as a dictionary to record those significant colors, and the corresponding palette index map is used to represent the color value of each pixel within the coding block. Furthermore, the “run” values are used to indicate the length of consecutive pixels that have the same significant colors (i.e., palette index), to reduce the spatial redundancy. Palette coding is usually good for large blocks containing sparse colors. An intra block copy involves using the reconstructed pixels to predict current coding blocks within the same picture, and the displacement information, which is referred to as the block vector, is coded.
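The palette table, index map, and run values described above can be illustrated with a small Python sketch. It is only a toy model: the function names `palette_encode` and `palette_decode` and the sample block are hypothetical, the scan is plain raster order, and escape colors and the actual bitstream syntax are not modeled.

```python
from typing import List, Tuple

def palette_encode(block: List[List[int]]) -> Tuple[List[int], List[Tuple[int, int]]]:
    """Return (palette, runs); each run is (palette_index, run_length)."""
    pixels = [p for row in block for p in row]  # raster scan (assumed order)
    palette: List[int] = []
    for p in pixels:
        if p not in palette:
            palette.append(p)  # record each distinct color once
    runs: List[Tuple[int, int]] = []
    for p in pixels:
        idx = palette.index(p)
        if runs and runs[-1][0] == idx:
            runs[-1] = (idx, runs[-1][1] + 1)  # extend the current run
        else:
            runs.append((idx, 1))              # start a new run
    return palette, runs

def palette_decode(palette: List[int], runs: List[Tuple[int, int]], width: int) -> List[List[int]]:
    """Rebuild the block from the palette and the run-coded index map."""
    pixels = [palette[idx] for idx, n in runs for _ in range(n)]
    return [pixels[i:i + width] for i in range(0, len(pixels), width)]

# Hypothetical 4x4 block with two colors, typical of text-like screen content.
block = [
    [255, 255, 255, 0],
    [255, 255, 0, 0],
    [255, 0, 0, 0],
    [0, 0, 0, 0],
]
palette, runs = palette_encode(block)
restored = palette_decode(palette, runs, width=4)
```

Sixteen pixel values reduce here to a two-entry palette and six runs, which reflects why the mode suits blocks with sparse colors.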

FIG. 13 is a diagram illustrating an example of full frame intra block copy mode. In consideration of complexity and bandwidth access, the HEVC screen content coding extension reference software (SCM-1.0) [R31] has two configurations for intra block copy mode. One is full frame intra block copy mode, in which all reconstructed pixels can be used for prediction as shown generally at 1700 in FIG. 13. In order to reduce the block vector searching complexity, hash based intra block copy search was proposed [R29], [R30]. Another is local region intra block copy mode, which is discussed next.

FIG. 14 is a diagram illustrating an example of local region intra block copy mode, as shown generally at 1800. When local region intra block copy mode is used, only those reconstructed pixels in the left and current coding tree units (CTU) are allowed to be used as reference.

There is another difference between SCC and natural video coding. For natural video coding, the coding distortion is distributed over the whole picture. However, for screen content, the error is usually localized around strong edges, which makes the artifacts more visible even when the PSNR (peak signal to noise ratio) is quite high for the whole picture. Therefore, screen content is more difficult to encode from a subjective quality point of view.

The use of intra block copy mode requires signaling of the block vector. In full frame intra block copy configuration, the block vector can be very large, resulting in high overhead for intra block copy mode. Often, one block can find multiple similar matches because there is a highly repetitive pattern for screen content. In order to improve the block vector coding efficiency, various prediction and coding methods have been proposed [R23]-[R28]. Embodiments of the presently disclosed systems and methods use block vector derivation to improve intra block copy coding efficiency. Among the variations discussed and described in this disclosure are (i) block vector derivation in intra block copy merge mode and (ii) block vector derivation in intra block copy with explicit block vector mode.

Included in the present disclosure is discussion of a displacement information derivation method for intra block copy coding. Depending on the coding type of a reference block, a derived block vector or motion vector can be used in different ways. One method is to use the derived BV as a merge candidate in IntraBC merge mode; this option is discussed below in a subsection that is entitled “Intra Block Copy Merge Mode.” Another method is to use the derived BV/MV for normal IntraBC prediction; this option is discussed below in a subsection that is entitled “Intra Block Copy Mode with Derived Block Vector.”

FIG. 16 is a diagram illustrating an example block vector derivation. Given a block vector BV0, a second block vector can be derived if the reference block pointed to by BV0 is an IntraBC coded block with its own block vector BV1. The derived block vector BVd is calculated in Eq. (4). FIG. 16 shows this kind of block vector derivation generally at 2000.

BVd=BV0+BV1   (4)
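As a minimal sketch of Eq. (4), the Python below adds the given block vector BV0 to the block vector BV1 of the block it points to. The lookup table `coded_bv`, keyed by the top-left position of IntraBC coded blocks, and the function name `derive_bv` are assumptions made for illustration; a real codec would look up the coding mode and vector of whichever coded block covers the referenced position.

```python
from typing import Dict, Optional, Tuple

Vec = Tuple[int, int]  # (horizontal, vertical) displacement in integer pixels

def derive_bv(block_pos: Vec, bv0: Vec, coded_bv: Dict[Vec, Vec]) -> Optional[Vec]:
    """Eq. (4): BVd = BV0 + BV1, or None when no BV1 is available."""
    ref_pos = (block_pos[0] + bv0[0], block_pos[1] + bv0[1])
    bv1 = coded_bv.get(ref_pos)  # hypothetical lookup: BV of the block at ref_pos
    if bv1 is None:
        return None  # reference block is not IntraBC coded: no derivation
    return (bv0[0] + bv1[0], bv0[1] + bv1[1])

# Hypothetical example: the block at (48, 32) was IntraBC coded with BV1 = (-24, -8).
coded_bv = {(48, 32): (-24, -8)}
bvd = derive_bv(block_pos=(64, 32), bv0=(-16, 0), coded_bv=coded_bv)
no_bvd = derive_bv(block_pos=(64, 32), bv0=(-32, 0), coded_bv=coded_bv)
```

Here the current block at (64, 32) with BV0 = (-16, 0) reaches the block at (48, 32), so the derived vector points directly from the current block to the block that B1 itself was predicted from.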

FIG. 17 is a diagram illustrating an example motion vector derivation. If the block pointed to by the given BV is an inter coded block, then the motion vector can be derived. FIG. 17 shows the MV derivation case generally at 2100. If block B1 in FIG. 17 is coded in uni-prediction mode, then the derived motion vector MVd in integer pixel units for block B0 is

MVd=BV0+((MV1+2)>>2)   (5)

In some embodiments, the derived value MVd_q in quarter pixel resolution is calculated as

MVd_q=(BV0<<2)+MV1

and the reference picture is the same as that of B1. In HEVC, the normal motion vector has quarter pixel precision, and the block vector has integer precision. Integer pixel motion for the derived motion vector is used by way of example here. If the block B1 is coded in bi-prediction mode, then there are two ways to perform motion vector derivation. One is to derive two motion vectors for the two directions separately, with the reference indices set in the same way as in uni-prediction mode. Another is to select the motion vector from the reference picture with a smaller quantization parameter (higher quality). If both reference pictures have the same quantization parameter, then we may select the motion vector from the closer reference picture in picture order count (POC) distance (higher correlation). The following discussion uses the example of using the second way to convert bi-prediction to uni-prediction to reduce the complexity.
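The arithmetic of Eq. (5), its quarter-pixel variant, and the second bi-prediction option can be sketched in Python as follows. The function names and the dictionary fields `mv`, `qp`, and `poc` are hypothetical; MV1 is assumed to be stored in quarter-pixel units and BV0 in integer pixels, as stated for HEVC above.

```python
from typing import Dict, List, Tuple

Vec = Tuple[int, int]

def derive_mv_integer(bv0: Vec, mv1_q: Vec) -> Vec:
    """Eq. (5): MVd = BV0 + ((MV1 + 2) >> 2), applied per component."""
    # Python's >> is an arithmetic shift, so negative components round as intended.
    return (bv0[0] + ((mv1_q[0] + 2) >> 2), bv0[1] + ((mv1_q[1] + 2) >> 2))

def derive_mv_quarter(bv0: Vec, mv1_q: Vec) -> Vec:
    """MVd_q = (BV0 << 2) + MV1, in quarter-pixel units."""
    return ((bv0[0] << 2) + mv1_q[0], (bv0[1] << 2) + mv1_q[1])

def select_uni_from_bi(cands: List[Dict], cur_poc: int) -> Dict:
    """Prefer the reference with the smaller QP; break ties by smaller POC distance."""
    return min(cands, key=lambda c: (c["qp"], abs(cur_poc - c["poc"])))

bv0 = (-16, 4)
mv1 = (9, -5)  # quarter-pixel units
mvd = derive_mv_integer(bv0, mv1)
mvd_q = derive_mv_quarter(bv0, mv1)

# Two hypothetical motion candidates of a bi-predicted block B1.
bi = [
    {"mv": (9, -5), "qp": 30, "poc": 8},
    {"mv": (3, 2), "qp": 30, "poc": 15},
]
chosen = select_uni_from_bi(bi, cur_poc=16)
```

With equal quantization parameters, the candidate whose reference picture is closer in POC (distance 1 rather than 8) is selected, matching the tie-break described above.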

Intra Block Copy Merge Mode

FIG. 15 is a diagram illustrating two examples of spatial candidates for intra block copy merge. In HEVC main profile [R13] and range extension [R17], inter coding unit merge mode does not signal the motion information directly, but signals an index in the inter merge candidate list to the decoder. The decoder constructs the inter merge candidate list in the same deterministic way as the encoder. The motion information is derived from the candidate list using the index. In the example that is numbered 1902 in FIG. 15, there are five spatial neighboring blocks and one temporal collocated block. Only those blocks coded in inter mode will be added to the inter merge candidate list. If the candidate list is not full after the spatial and temporal neighboring blocks are added, then bi-prediction motion obtained by combining existing merge candidates in the two lists, and zero motion, will be appended.

For intra block copy merge mode, a similar method of applying the merge mode is carried out. No BV information is coded explicitly, but a merge candidate index is coded. In the HEVC SCC extension (SCM-1.0) [R31], the BV is coded with differential coding using the last coded BV as its predictor. In BV candidate list construction, the BV predictor is checked first. If the BV predictor is valid for the current CU, then it is added as the first merge candidate. Then, in the examples numbered 1902 and 1904 in FIG. 15, 5 spatial blocks are checked; those valid BVs are added in order if (1) the spatial neighboring block is IntraBC coded and therefore has a BV, (2) the BV is valid for the current CU (for example, the reference block pointed to by the BV is not out of picture boundary and is already coded), and (3) the BV has not appeared in the current candidate list already. If the merge candidate list is not full, then BVs are derived with those valid BVs already in the list. In one embodiment, only the derived block vector in equation (4) is considered, and the derived motion vector in equation (5) is not considered; in such examples, all of the merge candidates in the candidate list are block vectors that correspond to intra block copy modes.

In a more complex design, the derived motion vector from equation (5) may be mixed with the block vectors and added to the merge candidates together with them. In another embodiment, for each candidate block vector BV0, if the BVd or MVd derived based on BV0 is valid, then the candidate block is treated as being in bi-prediction mode with BV0 and BVd/MVd, where the bi-prediction is obtained by averaging a first prediction obtained by applying the block vector BV0 with a second prediction obtained by applying the derived block or motion vector BVd/MVd.

FIGS. 18A and 18B together are a flowchart of an example method of intra block copy merge candidate list construction. The example method 2200 begins at step 2202, which is “IntraBC merge candidate derivation.” The method next proceeds into the dashed box that is labeled “Spatial BV Candidates Generation,” and in particular to step 2204, which is “Add the BV predictor to merge candidate list if it is valid for current CU.” Processing next proceeds to step 2206, which is “Check the BV from spatial neighboring blocks, add it to merge candidate list if it is valid.”

The method next proceeds to a decision box 2208, where the following condition is evaluated: “((left, top, top right, bottom left neighboring blocks are checked) || (num_of_cand_list >= max_num_of_merge_cand))?” If the condition at 2208 is determined to be false, processing returns to step 2206.

If the condition at 2208 is instead determined to be true, processing proceeds to decision box 2210, at which the following condition is evaluated: “(num_of_cand_list < max_num_of_merge_cand - 1)?” If the condition at 2210 is determined to be true, processing proceeds to step 2212, which is “Check the BV of top left neighboring block, add it to merge candidate list if it is valid.” If the condition at 2210 is determined to be false, step 2212 is bypassed. Either way, processing then proceeds to FIG. 18B, into a dashed box entitled “BVd Candidates Generation,” and specifically to decision box 2216, at which the following condition is evaluated: “((all spatial BV candidates in the list checked) || (num_of_cand_list >= max_num_of_merge_cand))?”

If the condition at 2216 is determined to be true, the processing ends at 2224. If the condition at 2216 is determined to be false, processing proceeds to step 2218, which is “Take one spatial BV from candidate list, and derive BVd.”

Next, processing proceeds to a decision box 2220, at which the following condition is evaluated: “BVd is valid?” If the condition at 2220 is determined to be true, processing proceeds to step 2222, which is “add BVd to merge candidate list.” If the condition at 2220 is instead determined to be false, then processing returns to the decision box 2216.
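The flow of FIGS. 18A and 18B can be approximated by the following Python sketch. The function name, the `is_valid` and `derive` callbacks, and the treatment of duplicates as skipped candidates are assumptions; in particular, the derivation loop here runs over every BV placed in the list by the spatial stage, including the BV predictor, and returns to the loop condition after each addition.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Vec = Tuple[int, int]

def build_intrabc_merge_list(
    bv_predictor: Optional[Vec],
    neighbors: Sequence[Optional[Vec]],   # order: left, top, top right, bottom left
    top_left: Optional[Vec],
    is_valid: Callable[[Vec], bool],      # validity of a BV for the current CU
    derive: Callable[[Vec], Optional[Vec]],  # BV0 -> BVd, or None
    max_cands: int,
) -> List[Vec]:
    cands: List[Vec] = []

    def try_add(bv: Optional[Vec]) -> None:
        # Add only BVs that exist, are valid, are not duplicates, and fit in the list.
        if bv is not None and len(cands) < max_cands and is_valid(bv) and bv not in cands:
            cands.append(bv)

    try_add(bv_predictor)                 # step 2204
    for bv in neighbors:                  # steps 2206 and 2208
        if len(cands) >= max_cands:
            break
        try_add(bv)
    if len(cands) < max_cands - 1:        # decision 2210
        try_add(top_left)                 # step 2212
    for bv0 in list(cands):               # decision 2216, iterating a snapshot
        if len(cands) >= max_cands:
            break
        try_add(derive(bv0))              # steps 2218 to 2222
    return cands

# Hypothetical inputs: one duplicate, one missing, and one invalid neighbor BV.
derived = {(-8, 0): (-24, -2), (-4, -4): (-12, -12)}
merge_list = build_intrabc_merge_list(
    bv_predictor=(-8, 0),
    neighbors=[(-8, 0), None, (-16, -4), (0, -8)],
    top_left=(-4, -4),
    is_valid=lambda bv: bv != (0, -8),
    derive=derived.get,
    max_cands=5,
)
```

Because encoder and decoder run the same deterministic procedure on the same inputs, only the index into the resulting list needs to be signaled.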

Intra Block Copy Mode with Derived Block Vector

In normal intra block copy mode, the block vector will be signaled explicitly for each prediction unit within a coding unit. In some embodiments, this mode is extended by adding a flag to indicate whether the signaled block vector or the derived block vector is used in IntraBC prediction. If the flag is 0, then the signaled block vector is used for IntraBC prediction and BV derivation need not be applied. If the flag is 1, then the BV or MV is derived using equation (4) or equation (5) based on the signaled block vector, and the derived BV or MV will be used for intra block copy prediction or motion compensated prediction.

Another embodiment is to add two flags to normal IntraBC mode. The first flag is used to indicate whether the BV derivation process is applied or not. If the first flag is 1, the second flag is coded to indicate whether uni-prediction or bi-prediction is used. If the second flag is 0, then only the derived BV or MV is used for intra block copy prediction or motion compensated prediction. Otherwise, if the second flag is 1, the signaled BV is used to generate the first prediction, and the derived BV or MV is used to generate the second prediction; the final prediction is generated by averaging those two predictions, analogously to bi-prediction mode.
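The two-flag behavior described above can be sketched in Python for the case in which the derived vector is a block vector within the same picture. The names `copy_block` and `intrabc_predict`, the list-of-lists picture layout, and the rounding offset in the average are assumptions; the derived MV case, which would fetch from a reference picture, is not modeled.

```python
from typing import List, Tuple

Vec = Tuple[int, int]
Picture = List[List[int]]  # picture[y][x], reconstructed samples

def copy_block(pic: Picture, x: int, y: int, w: int, h: int) -> Picture:
    """Return the w-by-h block whose top-left sample is at (x, y)."""
    return [row[x:x + w] for row in pic[y:y + h]]

def intrabc_predict(pic: Picture, x: int, y: int, w: int, h: int,
                    signaled_bv: Vec, derived_bv: Vec,
                    derivation_flag: int, bi_flag: int = 0) -> Picture:
    first = copy_block(pic, x + signaled_bv[0], y + signaled_bv[1], w, h)
    if not derivation_flag:
        return first                      # first flag 0: signaled BV only
    second = copy_block(pic, x + derived_bv[0], y + derived_bv[1], w, h)
    if not bi_flag:
        return second                     # second flag 0: derived BV only
    # Second flag 1: average the two predictions (rounding offset is an assumption).
    return [[(a + b + 1) >> 1 for a, b in zip(r1, r2)]
            for r1, r2 in zip(first, second)]

# Hypothetical 8x4 picture and a 2x2 current block at (6, 2).
pic = [[10 * r + c for c in range(8)] for r in range(4)]
uni_signaled = intrabc_predict(pic, 6, 2, 2, 2, (-2, 0), (-6, -2), derivation_flag=0)
uni_derived = intrabc_predict(pic, 6, 2, 2, 2, (-2, 0), (-6, -2), derivation_flag=1)
bi = intrabc_predict(pic, 6, 2, 2, 2, (-2, 0), (-6, -2), derivation_flag=1, bi_flag=1)
```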

Memory Access Bandwidth Reduction for Block Vector Derivation

Block vector derivation operates using information regarding the block coding mode and block vector/motion vector, such as the information of block B1 in FIG. 16 and FIG. 17. For decoder chip design, there are two ways to store the mode/BV/motion information of all coded blocks. One is to store the information in external memory. This technique requires access to external memory, thus increasing memory access bandwidth. Another technique is to cache the information in on-chip memory, which increases the cache size.

Two exemplary methods are described here for reducing the amount of information required to be stored. One is to store that information with coarse granularity. In HEVC, the original BV/MV information is stored based on a 4×4 block size. The memory size will be greatly reduced by storing the original BV/MV information in compressed form based on larger block sizes, for example, in 16×16 block size. If 16×16 block size is used, the required BV storage has the same granularity as compressed motion in HEVC. In this way, it is possible to cache those data in a reasonable size. The second solution is to cache that information of coded blocks in a limited range instead of all blocks already coded. For example, the decoder may only cache the information pertaining to the current CTU row and a limited number of coded neighboring CTU rows above the current CTU row. If the block B1 pointed to by the first BV in FIG. 16 and FIG. 17 is outside the range that the decoder caches, then this BV will be regarded as invalid, and BV/MV derivation will not be applied.
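Both storage reductions can be combined in one small structure, sketched below in Python. The class name `CoarseInfoStore`, its parameters, and the rule that the first value written for a 16×16 unit is the one kept are assumptions for illustration; the text above does not specify which 4×4 entry represents a compressed unit.

```python
from typing import Dict, Optional, Tuple

Vec = Tuple[int, int]

class CoarseInfoStore:
    """Keeps one BV/MV entry per unit x unit area, readable only near the current CTU row."""

    def __init__(self, unit: int = 16, ctu_size: int = 64, rows_above: int = 1):
        self.unit = unit              # storage granularity in pixels
        self.ctu_size = ctu_size      # CTU height in pixels
        self.rows_above = rows_above  # cached CTU rows above the current one
        self._info: Dict[Tuple[int, int], Vec] = {}

    def put(self, x: int, y: int, vec: Vec) -> None:
        # First write wins for each coarse unit (assumed compression rule).
        self._info.setdefault((x // self.unit, y // self.unit), vec)

    def get(self, x: int, y: int, cur_ctu_row: int) -> Optional[Vec]:
        row = y // self.ctu_size
        if row > cur_ctu_row or row < cur_ctu_row - self.rows_above:
            return None  # outside the cached range: treated as invalid
        return self._info.get((x // self.unit, y // self.unit))

store = CoarseInfoStore()
store.put(0, 0, (-8, 0))
store.put(4, 4, (-12, 0))    # same 16x16 unit as (0, 0); not stored separately
store.put(16, 0, (-4, -4))
```

A lookup that returns `None` corresponds to the case above in which block B1 lies outside the cached range, so the BV is regarded as invalid and no derivation is applied.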

Coding Syntax and Semantics

New syntax elements were proposed to signal the CUs that are coded with intra block copy merge and intra block copy with derived block vector, based on the current syntax design of the HEVC range extension draft [R17]. The proposed intra block copy coding methods discussed in this section can be signaled in the bit-stream by introducing additional syntax. The following table (Table 2) shows the proposed syntax elements; changes relative to the HEVC range extension draft [R17] are contained in the lines numbered [10], [12], [13], [14], [17], [27], [28], [29], [40], [41], and [42].

Syntax

TABLE 2

Syntax element    Descriptor

[01] coding_unit( x0, y0, log2CbSize ) {
[02]   if( transquant_bypass_enabled_flag )
[03]     cu_transquant_bypass_flag    ae(v)
[04]   if( slice_type != I )
[05]     cu_skip_flag[ x0 ][ y0 ]    ae(v)
[06]   nCbS = ( 1 << log2CbSize )
[07]   if( cu_skip_flag[ x0 ][ y0 ] )
[08]     prediction_unit( x0, y0, nCbS, nCbS )
[09]   else {
[10]     if( intra_block_copy_enabled_flag ) {
[11]       intra_bc_flag[ x0 ][ y0 ]    ae(v)
[12]       if( intra_bc_flag[ x0 ][ y0 ] )
[13]         intra_bc_merge_flag[ x0 ][ y0 ]    ae(v)
[14]     }
[15]     if( slice_type != I && !intra_bc_flag[ x0 ][ y0 ] )
[16]       pred_mode_flag    ae(v)
[17]     if( CuPredMode[ x0 ][ y0 ] != MODE_INTRA || ( intra_bc_flag[ x0 ][ y0 ] && !intra_bc_merge_flag[ x0 ][ y0 ] ) || log2CbSize = = MinCbLog2SizeY )
[18]       part_mode    ae(v)
[19]     if( CuPredMode[ x0 ][ y0 ] = = MODE_INTRA ) {
[20]       if( PartMode = = PART_2Nx2N && pcm_enabled_flag && !intra_bc_flag[ x0 ][ y0 ] && log2CbSize >= Log2MinIpcmCbSizeY && log2CbSize <= Log2MaxIpcmCbSizeY )
[21]         pcm_flag[ x0 ][ y0 ]    ae(v)
[22]       if( pcm_flag[ x0 ][ y0 ] ) {
[23]         while( !byte_aligned( ) )
[24]           pcm_alignment_zero_bit    f(1)
[25]         pcm_sample( x0, y0, log2CbSize )
[26]       } else if( intra_bc_flag[ x0 ][ y0 ] ) {
[27]         if( intra_bc_merge_flag[ x0 ][ y0 ] )
[28]           intra_bc_merge_index[ x0 ][ y0 ]    ae(v)
[29]         if( !intra_bc_merge_flag[ x0 ][ y0 ] ) {
[30]           mvd_coding( x0, y0, 2 )
[31]           if( PartMode = = PART_2NxN )
[32]             mvd_coding( x0, y0 + ( nCbS / 2 ), 2 )
[33]           else if( PartMode = = PART_Nx2N )
[34]             mvd_coding( x0 + ( nCbS / 2 ), y0, 2 )
[35]           else if( PartMode = = PART_NxN ) {
[36]             mvd_coding( x0 + ( nCbS / 2 ), y0, 2 )
[37]             mvd_coding( x0, y0 + ( nCbS / 2 ), 2 )
[38]             mvd_coding( x0 + ( nCbS / 2 ), y0 + ( nCbS / 2 ), 2 )
[39]           }
[40]           if( PartMode = = PART_2Nx2N )
[41]             intra_bc_bv_derivation_flag[ x0 ][ y0 ]    ae(v)
[42]         }
[43]       } else {
[44]         pbOffset = ( PartMode = = PART_NxN ) ? ( nCbS / 2 ) : nCbS
[45]         for( j = 0; j < nCbS; j = j + pbOffset )
[46]           for( i = 0; i < nCbS; i = i + pbOffset )
[47]             prev_intra_luma_pred_flag[ x0 + i ][ y0 + j ]    ae(v)
[48]         for( j = 0; j < nCbS; j = j + pbOffset )
[49]           for( i = 0; i < nCbS; i = i + pbOffset )
[50]             if( prev_intra_luma_pred_flag[ x0 + i ][ y0 + j ] )
[51]               mpm_idx[ x0 + i ][ y0 + j ]    ae(v)
[52]             else
[53]               rem_intra_luma_pred_mode[ x0 + i ][ y0 + j ]    ae(v)
[54]         if( ChromaArrayType = = 3 )
[55]           for( j = 0; j < nCbS; j = j + pbOffset )
[56]             for( i = 0; i < nCbS; i = i + pbOffset )
[57]               intra_chroma_pred_mode[ x0 + i ][ y0 + j ]    ae(v)
[58]         else if( ChromaArrayType != 0 )
[59]           intra_chroma_pred_mode[ x0 ][ y0 ]    ae(v)
[60]       }
[61]     } else {
[62]       if( PartMode = = PART_2Nx2N )
[63]         prediction_unit( x0, y0, nCbS, nCbS )
[64]       else if( PartMode = = PART_2NxN ) {
[65]         prediction_unit( x0, y0, nCbS, nCbS / 2 )
[66]         prediction_unit( x0, y0 + ( nCbS / 2 ), nCbS, nCbS / 2 )
[67]       } else if( PartMode = = PART_Nx2N ) {
[68]         prediction_unit( x0, y0, nCbS / 2, nCbS )
[69]         prediction_unit( x0 + ( nCbS / 2 ), y0, nCbS / 2, nCbS )
[70]       } else if( PartMode = = PART_2NxnU ) {
[71]         prediction_unit( x0, y0, nCbS, nCbS / 4 )
[72]         prediction_unit( x0, y0 + ( nCbS / 4 ), nCbS, nCbS * 3 / 4 )
[73]       } else if( PartMode = = PART_2NxnD ) {
[74]         prediction_unit( x0, y0, nCbS, nCbS * 3 / 4 )
[75]         prediction_unit( x0, y0 + ( nCbS * 3 / 4 ), nCbS, nCbS / 4 )
[76]       } else if( PartMode = = PART_nLx2N ) {
[77]         prediction_unit( x0, y0, nCbS / 4, nCbS )
[78]         prediction_unit( x0 + ( nCbS / 4 ), y0, nCbS * 3 / 4, nCbS )
[79]       } else if( PartMode = = PART_nRx2N ) {
[80]         prediction_unit( x0, y0, nCbS * 3 / 4, nCbS )
[81]         prediction_unit( x0 + ( nCbS * 3 / 4 ), y0, nCbS / 4, nCbS )
[82]       } else { /* PART_NxN */
[83]         prediction_unit( x0, y0, nCbS / 2, nCbS / 2 )
[84]         prediction_unit( x0 + ( nCbS / 2 ), y0, nCbS / 2, nCbS / 2 )
[85]         prediction_unit( x0, y0 + ( nCbS / 2 ), nCbS / 2, nCbS / 2 )
[86]         prediction_unit( x0 + ( nCbS / 2 ), y0 + ( nCbS / 2 ), nCbS / 2, nCbS / 2 )
[87]       }
[88]     }
[89]     if( !pcm_flag[ x0 ][ y0 ] ) {
[90]       if( CuPredMode[ x0 ][ y0 ] != MODE_INTRA && !( PartMode = = PART_2Nx2N && merge_flag[ x0 ][ y0 ] ) || ( CuPredMode[ x0 ][ y0 ] = = MODE_INTRA && intra_bc_flag[ x0 ][ y0 ] ) )
[91]         rqt_root_cbf    ae(v)
[92]       if( rqt_root_cbf ) {
[93]         MaxTrafoDepth = ( CuPredMode[ x0 ][ y0 ] = = MODE_INTRA ? ( max_transform_hierarchy_depth_intra + IntraSplitFlag ) : max_transform_hierarchy_depth_inter )
[94]         transform_tree( x0, y0, x0, y0, log2CbSize, 0, 0 )
[95]       }
[96]     }
[97]   }
[98] }

With respect to Table 2, the following is noted:

-   intra_bc_merge_flag[x0][y0] equal to 1 specifies that the current coding unit is coded in merge mode, and the block vector is selected from the merge candidates. intra_bc_merge_flag[x0][y0] equal to 0 specifies that the coding unit is not coded in merge mode, and the block vector of the current coding unit is coded explicitly. When not present, the value of intra_bc_merge_flag is inferred to be equal to 0. The array indices x0, y0 specify the location (x0, y0) of the top-left luma sample of the considered coding block relative to the top-left luma sample of the picture.
-   intra_bc_merge_index[x0][y0] specifies the index, among the merge candidates, of the candidate whose block vector is the same as the block vector of the current coding unit. intra_bc_merge_index[x0][y0] shall be in the range of 0 to the maximal number of intra block copying merge candidates minus 1. The array indices x0, y0 specify the location (x0, y0) of the top-left luma sample of the considered coding block relative to the top-left luma sample of the picture.
-   intra_bc_bv_derivation_flag[x0][y0] equal to 1 specifies that the derived BV or MV is used for current PU prediction.

Further Exemplary Embodiments

In an exemplary embodiment, a derived block vector is generated, and the derived block vector is used as a merge candidate in an intra block-copy merge mode. In some such methods, the derived block vector, BVd, is determined according to BVd=BV0+BV1, given block vector BV0.

In another exemplary embodiment, a derived motion vector is generated, and the derived motion vector is used as a merge candidate in an intra block-copy merge mode. In some such methods, the derived motion vector in integer pixel units, MVd, is determined according to MVd=BV0+((MV1+2)>>2), given block vector BV0.
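The two derivations above can be sketched as follows. This is a minimal illustration rather than a reference implementation. It assumes that vectors are (x, y) integer tuples, that BV0 and BV1 are in integer-pixel units, and that MV1 is stored in quarter-pixel units, which is what the rounding offset of 2 and the right shift by 2 suggest. The function names are hypothetical.

```python
from typing import Tuple

Vec = Tuple[int, int]  # (horizontal, vertical) displacement


def derive_bv(bv0: Vec, bv1: Vec) -> Vec:
    """BVd = BV0 + BV1, with both inputs in integer-pixel units."""
    return (bv0[0] + bv1[0], bv0[1] + bv1[1])


def derive_mv_integer(bv0: Vec, mv1: Vec) -> Vec:
    """MVd = BV0 + ((MV1 + 2) >> 2).

    MV1 is assumed to be in quarter-pixel units; adding 2 before the
    arithmetic right shift rounds it to the nearest integer pixel.
    """
    return (
        bv0[0] + ((mv1[0] + 2) >> 2),
        bv0[1] + ((mv1[1] + 2) >> 2),
    )
```

For example, under these assumptions `derive_bv((-8, 0), (-4, -16))` yields `(-12, -16)`.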

In an exemplary embodiment, a derived block vector is generated, and the derived block vector is used for normal IntraBC prediction with intra block copy method.

In another exemplary embodiment, a derived motion vector is generated, and the derived motion vector is used for normal IntraBC prediction with motion compensation prediction method.

In an exemplary embodiment, a block vector (BV) candidate list is formed. Derived BVs (BVds) are generated, and the BVds are added to the candidate list.

In some such embodiments, the formation of the BV candidate list includes: adding the BV predictor if the BV predictor is valid for a current coding unit (CU), and checking five spatial blocks and adding those respective spatial block BVs that are valid. In some such embodiments, the spatial block BVs are added only if (i) the spatial neighboring block is IntraBC coded, (ii) the BV is valid for the current CU and (iii) the BV has not appeared in the current candidate list already. In some such embodiments, the BVds are generated only if the merge candidate list is not full. The BVds may be checked for validity prior to being added to the candidate list. In some embodiments, the BVds that are generated are only derived according to BVd=BV0+BV1. In some such methods, derived motion vectors (MVds) are generated, and the MVds are added to the candidate list. In some embodiments, the derived vectors are obtained according to both (i) BVd=BV0+BV1 and (ii) MVd=BV0+((MV1+2)>>2). In some embodiments, a candidate block is treated as bi-prediction mode with BV0 and BVd/MVd. A bi-prediction may first be obtained by averaging a first prediction obtained by applying a block vector BV0, with a second prediction obtained by applying a block or motion vector BVd/MVd.
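The list formation described above may be sketched as follows. This is an illustrative outline under stated assumptions, not the normative process: the validity test and the lookup of the block vector used by a referenced block are supplied by the caller as hypothetical callbacks (`is_valid`, `bv_of_referenced_block`), neighbors that are not IntraBC coded are represented by `None`, only the BVd=BV0+BV1 derivation is shown, and duplicate pruning of derived vectors is an added assumption.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Vec = Tuple[int, int]


def build_bv_candidates(
    bv_predictor: Optional[Vec],
    spatial_bvs: Sequence[Optional[Vec]],
    bv_of_referenced_block: Callable[[Vec], Optional[Vec]],
    is_valid: Callable[[Vec], bool],
    max_candidates: int,
) -> List[Vec]:
    candidates: List[Vec] = []

    def try_add(bv: Optional[Vec]) -> None:
        # Skip missing vectors and stop once the list is full.
        if bv is None or len(candidates) >= max_candidates:
            return
        # Add only vectors that are valid and not already listed.
        if is_valid(bv) and bv not in candidates:
            candidates.append(bv)

    # 1. The BV predictor, if valid for the current CU.
    try_add(bv_predictor)

    # 2. BVs of the spatial neighbors (None marks a non-IntraBC neighbor).
    for bv in spatial_bvs:
        try_add(bv)

    # 3. Derived BVs, generated only while the list is not full.
    for bv0 in list(candidates):
        if len(candidates) >= max_candidates:
            break
        bv1 = bv_of_referenced_block(bv0)
        if bv1 is not None:
            try_add((bv0[0] + bv1[0], bv0[1] + bv1[1]))

    return candidates
```

Passing the validity test and the block-vector lookup as callbacks keeps the sketch independent of any particular picture buffer or slice layout.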

In an exemplary embodiment, a coded video bitstream is received, and a flag is identified in the bitstream indicating whether a signaled block vector or a derived block vector is used in IntraBC prediction. If the flag is a first value, then the signaled block vector is used for IntraBC prediction. If the flag is a second value, then the BV or MV is derived based on the signaled block vector. In some such methods, the derived BV or MV is used for intra block copy prediction or motion compensated prediction.

In another exemplary embodiment, a coded video bitstream is received. A first flag in the bitstream is identified as indicating whether a signaled block vector or a derived block vector is used in IntraBC prediction. A second flag in the bitstream is identified as indicating whether uni-prediction or bi-prediction is used.

In some such embodiments, if the first flag is 1 and the second flag is 0, then only the derived BV/MV is used for intra block copy prediction or motion compensated prediction. If the first flag is 1 and the second flag is 1, the signaled BV is used to generate the first prediction, the derived BV/MV is used to generate the second prediction, and the final prediction is generated by averaging those two predictions.
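The flag combinations above can be illustrated with a short sketch. It assumes that the two candidate predictions have already been formed as equally sized 2-D lists of integer samples, that a first flag of 0 selects the signaled vector alone, and that averaging rounds to nearest via `(a + b + 1) >> 1`; the rounding rule and all names are assumptions made for illustration.

```python
from typing import List

Block = List[List[int]]  # rows of integer sample values


def average_blocks(p0: Block, p1: Block) -> Block:
    """Sample-wise average of two predictions, rounded to nearest (assumed)."""
    return [
        [(a + b + 1) >> 1 for a, b in zip(row0, row1)]
        for row0, row1 in zip(p0, p1)
    ]


def select_prediction(
    derivation_flag: int,
    bi_pred_flag: int,
    pred_signaled: Block,
    pred_derived: Block,
) -> Block:
    if not derivation_flag:
        # First flag 0: the signaled block vector alone is used.
        return pred_signaled
    if not bi_pred_flag:
        # First flag 1, second flag 0: only the derived BV/MV is used.
        return pred_derived
    # Both flags 1: average the signaled and derived predictions.
    return average_blocks(pred_signaled, pred_derived)
```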

In an exemplary embodiment, original BV/MV information is compressed by storing the BV/MV information based on larger block sizes. The larger block size may be, for example, a 16×16 block size.
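One way to picture this storage compression is a vector field indexed at 16×16 granularity, as sketched below. The text does not say which vector represents each 16×16 cell; this sketch assumes the first vector written to a cell is kept. The class and method names are hypothetical.

```python
from typing import Dict, Optional, Tuple

Vec = Tuple[int, int]


class CompressedVectorField:
    """Stores one BV/MV per granularity x granularity cell of the picture."""

    def __init__(self, granularity: int = 16) -> None:
        self.granularity = granularity
        self._cells: Dict[Tuple[int, int], Vec] = {}

    def _cell(self, x: int, y: int) -> Tuple[int, int]:
        # Map a sample position to the cell that contains it.
        return (x // self.granularity, y // self.granularity)

    def store(self, x: int, y: int, vec: Vec) -> None:
        # Assumption: the first vector written to a cell represents it.
        self._cells.setdefault(self._cell(x, y), vec)

    def fetch(self, x: int, y: int) -> Optional[Vec]:
        # Returns None when no vector was stored for the containing cell.
        return self._cells.get(self._cell(x, y))
```

With this layout, any position inside a 16×16 cell retrieves the same stored vector, so memory scales with the number of cells rather than the number of smaller coded blocks.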

In another exemplary embodiment, a decoder is used to cache information of coded blocks in a limited range that is less than all blocks already coded. In some such methods, the decoder only caches information of a current CTU row and a predetermined number of coded neighboring CTU rows above the current CTU row.
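A limited-range cache of this kind might look like the following sketch. It assumes CTUs are decoded in raster order, so the current CTU row index never decreases, and that cached information is keyed by CTU row and column. The class name, the `rows_above` parameter, and the opaque `info` payload are illustrative assumptions.

```python
from typing import Any, Dict, Optional


class CtuRowCache:
    """Keeps block information for the current CTU row and a fixed number
    of CTU rows above it; older rows are discarded."""

    def __init__(self, rows_above: int) -> None:
        self.rows_above = rows_above
        self._rows: Dict[int, Dict[int, Any]] = {}

    def put(self, ctu_row: int, ctu_col: int, info: Any) -> None:
        self._rows.setdefault(ctu_row, {})[ctu_col] = info
        # Evict every row that lies further above than the allowed range.
        oldest_kept = ctu_row - self.rows_above
        for row in [r for r in self._rows if r < oldest_kept]:
            del self._rows[row]

    def get(self, ctu_row: int, ctu_col: int) -> Optional[Any]:
        # Returns None for rows that were never stored or already evicted.
        return self._rows.get(ctu_row, {}).get(ctu_col)
```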

In an exemplary embodiment, a video coding method of deriving a predictive vector is provided. The method includes identifying a first candidate block vector for prediction of a video block, wherein the first candidate block vector points to a first candidate block. At least a first predictive vector associated with the first candidate block is identified. A derived predictive vector is generated from the first candidate block vector and the first predictive vector, and the video block is coded using the derived predictive vector. In some embodiments, the coding of the video block using the derived predictive vector includes identifying a second candidate block that the derived predictive vector points to and predicting the video block using the second candidate block.

In some embodiments, coding the video block includes signaling the first candidate block vector in a bit stream. In some embodiments, coding the video block further includes signaling the first predictive vector in the bit stream. In some embodiments, coding the video block further includes signaling a flag in the bit stream, wherein a first value of the flag indicates that the derived predictive vector is used to code the video block and wherein a second value of the flag indicates that the first candidate block vector is used to code the video block. In some embodiments, encoding the input video block in the bit stream includes encoding in the bit stream an index identifying the derived predictive vector in a merge candidate list.

The derived predictive vector is generated in some embodiments by adding the first candidate block vector and the first predictive vector. Where the first candidate block vector and the first predictive vector have different precisions, the adding of the first candidate block vector and the first predictive vector may be performed after aligning them to the same precision.

In some embodiments, the video coding method further includes generating a merge candidate list and inserting the derived predictive vector into the merge candidate list. In some such embodiments, a determination is made of whether the derived predictive vector is valid, and the derived predictive vector is inserted into the merge candidate list only after determining that the derived predictive vector is valid. Coding the video block using the derived predictive vector includes providing an index identifying the derived predictive vector in the merge candidate list. The determination of whether the derived predictive vector is valid includes, in some embodiments, identifying a second candidate block that the derived predictive vector points to, and determining whether all samples in the second candidate block are available. The derived predictive vector is determined to be valid if all samples in the second candidate block are available. The derived predictive vector is determined not to be valid if at least one sample in the second candidate block is not available. In some embodiments, a sample in the second candidate block is unavailable if any of the following is true: the sample is not yet coded, or the sample is in a different slice or in a different tile, or the sample is out of a video picture boundary.
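The availability test described above can be sketched as follows. The picture dimensions are passed explicitly, and the questions "has this sample been coded?" and "is this sample in the same slice and tile?" are delegated to hypothetical caller-supplied predicates, since their implementation depends on the codec's internal state. The sketch checks every sample for clarity and is not optimized.

```python
from typing import Callable, Tuple

Vec = Tuple[int, int]


def is_derived_vector_valid(
    block_x: int,
    block_y: int,
    width: int,
    height: int,
    vec: Vec,
    pic_width: int,
    pic_height: int,
    is_coded: Callable[[int, int], bool],
    in_same_slice_and_tile: Callable[[int, int], bool],
) -> bool:
    # Top-left corner of the second candidate block the vector points to.
    ref_x = block_x + vec[0]
    ref_y = block_y + vec[1]

    # Any sample outside the picture boundary makes the vector invalid.
    if ref_x < 0 or ref_y < 0:
        return False
    if ref_x + width > pic_width or ref_y + height > pic_height:
        return False

    # Every sample must be already coded and in the same slice and tile.
    for y in range(ref_y, ref_y + height):
        for x in range(ref_x, ref_x + width):
            if not is_coded(x, y) or not in_same_slice_and_tile(x, y):
                return False
    return True
```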

In an exemplary embodiment, a method is provided of decoding a coded video block from a bit stream encoding a video. At least a first candidate block vector is identified for prediction of the coded video block, wherein the first candidate block vector points to a first candidate block. At least a first predictive vector associated with the first candidate block is identified. A derived predictive vector is generated from the first candidate block vector and the first predictive vector, and the coded video block is decoded using the derived predictive vector. In some such embodiments, identification of a first candidate block vector includes receiving the first candidate block vector signaled in the bit stream.

In some such embodiments, decoding of the coded video block using the derived predictive vector is performed in response to receiving a flag in the bit stream indicating that the coded video block is encoded with a derived predictive vector.

In some embodiments, identification of a first candidate block vector includes identification of a first block vector merge candidate. In some embodiments, the derived predictive vector is a derived predictive vector merge candidate, and decoding the coded video block using the derived predictive vector merge candidate is performed in response to receiving an index in the bit stream identifying the derived predictive vector merge candidate.

In an exemplary embodiment, a video encoding method is provided for generating a bit stream encoding a video including an input video block. A neighboring block of the input video block is identified. The neighboring block may be, for example, the left, top, or top-left neighbor of the input video block. A first block vector associated with the neighboring block is identified, where the first block vector points to a first candidate block. A second block vector associated with the first candidate block is identified. A derived block vector is generated by adding the first block vector and the second block vector, and a first prediction block is generated for prediction of the input video block using the derived block vector.

In some such embodiments, the encoding of the input video block further includes generating at least a second prediction block for prediction of the input video block using a third block vector. The first prediction block and the second prediction block are compared, and one of the prediction blocks and its associated block vector are selected based on an encoding metric. The encoding metric may be, for example, a Lagrangian rate-distortion cost.
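A selection based on a Lagrangian rate-distortion cost can be sketched with the standard form J = D + λ·R, where D is the distortion, R is the rate in bits, and λ is the Lagrange multiplier. How D and R are measured is not specified here, so the sketch takes them as precomputed inputs; the function name and tuple layout are assumptions.

```python
from typing import Iterable, Tuple

Vec = Tuple[int, int]
Candidate = Tuple[Vec, float, float]  # (block vector, distortion D, rate R in bits)


def select_by_rd_cost(candidates: Iterable[Candidate], lam: float) -> Candidate:
    """Return the candidate minimizing the Lagrangian cost J = D + lam * R."""
    return min(candidates, key=lambda c: c[1] + lam * c[2])
```

A larger λ penalizes rate more heavily, so the same pair of candidates can yield a different choice at a different operating point.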

In an exemplary embodiment, a video encoder is provided for generating a bit stream to encode a video that includes an input video block. The encoder includes a processor and a non-transitory storage medium storing instructions operative, when executed on the processor, to perform functions including: identifying at least a first candidate block vector for prediction of the input video block, wherein the first candidate block vector points to a first candidate block; identifying a first predictive vector used to encode the first candidate block; generating a derived predictive vector from the first candidate block vector and the first predictive vector; and encoding the input video block in the bit stream using the derived predictive vector for prediction of the input video block.

In an exemplary embodiment, a video encoder is provided for generating a bit stream encoding a video including an input video block. The encoder includes a processor and a non-transitory storage medium storing instructions operative, when executed on the processor, to perform functions including: identifying at least a first block vector merge candidate for encoding of the input video block; identifying a first predictive vector used to encode the first candidate block; generating a derived predictive vector from the first block vector merge candidate and the first predictive vector; inserting the derived predictive vector in a merge candidate list; from the merge candidate list, choosing a selected predictive vector for prediction of the input video block; and encoding the input video block in the bit stream using the selected predictive vector for prediction of the input video block.

In an exemplary embodiment, a video decoder is provided for decoding a coded video block from a bit stream encoding a video, the decoder including a processor and a non-transitory storage medium storing instructions operative, when executed on the processor, to perform functions including: identifying at least a first block vector for decoding of the coded video block; identifying a first predictive vector used to encode the first block vector; generating a derived predictive vector from the first block vector and the first predictive vector; and decoding the coded video block using the derived predictive vector for prediction of the coded video block.

Although features and elements are described above in particular combinations, one of ordinary skill in the art will appreciate that each feature or element can be used alone or in any combination with the other features and elements. In addition, the methods described herein may be implemented in a computer program, software, or firmware incorporated in a computer-readable medium for execution by a computer or processor. Examples of computer-readable media include electronic signals (transmitted over wired or wireless connections) and computer-readable storage media. Examples of computer-readable storage media include, but are not limited to, a read only memory (ROM), a random access memory (RAM), a register, cache memory, semiconductor memory devices, magnetic media such as internal hard disks and removable disks, magneto-optical media, and optical media such as CD-ROM disks, and digital versatile disks (DVDs). A processor in association with software may be used to implement a radio frequency transceiver for use in a WTRU, UE, terminal, base station, RNC, or any host computer.

REFERENCES

-   [R1] ITU-T Q6/16 and ISO/IEC JTC1/SC29/WG11, “Joint Call for Proposals for Coding of Screen Content”, MPEG2014/N14175, Jan. 2014, San Jose, USA.
-   [R2] J. Chen, Y. Chen, T. Hsieh, R. Joshi, M. Karczewicz, W.-S. Kim, X. Li, C. Pang, W. Pu, K. Rapaka, J. Sole, L. Zhang, F. Zou, “Description of screen content coding technology proposal by Qualcomm”, JCTVC-Q0031, Mar. 2014, Valencia, ES.
-   [R3] C.-C. Chen, T.-S. Chang, R.-L. Liao, C.-W. Kuo, W.-H. Peng, H.-M. Hang, Y.-J. Chang, C.-H. Hung, C.-C. Lin, J.-S. Tu, E.-C. Ke, J.-Y. Kao, C.-L. Lin, F.-D. Jou, F.-C. Chen, “Description of screen content coding technology proposal by NCTU and ITRI International”, JCTVC-Q0032, Mar. 2014, Valencia, ES.
-   [R4] P. Lai, T.-D. Chuang, Y.-C. Sun, X. Xu, J. Ye, S.-T. Hsiang, Y.-W. Chen, K. Zhang, X. Zhang, S. Liu, Y.-W. Huang, S. Lei, “Description of screen content coding technology proposal by MediaTek”, JCTVC-Q0033, Mar. 2014, Valencia, ES.
-   [R5] Z. Ma, W. Wang, M. Xu, X. Wang, H. Yu, “Description of screen content coding technology proposal by Huawei Technologies”, JCTVC-Q0034, Mar. 2014, Valencia, ES.
-   [R6] B. Li, J. Xu, F. Wu, X. Guo, G. J. Sullivan, “Description of screen content coding technology proposal by Microsoft”, JCTVC-Q0035, Mar. 2014, Valencia, ES.
-   [R7] R. Cohen, A. Minezawa, X. Zhang, K. Miyazawa, A. Vetro, S. Sekiguchi, K. Sugimoto, T. Murakami, “Description of screen content coding technology proposal by Mitsubishi Electric Corporation”, JCTVC-Q0036, Mar. 2014, Valencia, ES.
-   [R8] X. Xiu, C.-M. Tsai, Y. He, Y. Ye, “Description of screen content coding technology proposal by InterDigital”, JCTVC-Q0037, Mar. 2014, Valencia, ES.
-   [R9] T. Lin, S. Wang, P. Zhang, and K. Zhou, “AHG8: P2M based dual-coder extension of HEVC”, Document no. JCTVC-L0303, Jan. 2013.
-   [R10] X. Guo, B. Li, J.-Z. Xu, Y. Lu, S. Li, and F. Wu, “AHG8: Major-color-based screen content coding”, Document no. JCTVC-O0182, Oct. 2013.
-   [R11] L. Guo, M. Karczewicz, J. Sole, and R. Joshi, “Evaluation of Palette Mode Coding on HM-12.0+RExt-4.1”, JCTVC-O0218, Oct. 2013.
-   [R12] C. Pang, J. Sole, L. Guo, M. Karczewicz, and R. Joshi, “Non-RCE3: Intra Motion Compensation with 2-D MVs”, JCTVC-N0256, July 2013.
-   [R13] B. Bross, W.-J. Han, G. J. Sullivan, J.-R. Ohm, T. Wiegand, “High Efficiency Video Coding (HEVC) Text Specification Draft 10”, JCTVC-L1003, Jan. 2013.
-   [R14] G. J. Sullivan and T. Wiegand, “Rate-distortion optimization for video compression”, IEEE Signal Processing Magazine, vol. 15, issue 6, November 1998.
-   [R15] T. Vermeir, “Use cases and requirements for lossless and screen content coding”, JCTVC-M0172, Apr. 2013, Incheon, KR.
-   [R16] J. Sole, R. Joshi, M. Karczewicz, “AhG8: Requirements for wireless display applications”, JCTVC-M0315, Apr. 2013, Incheon, KR.
-   [R17] D. Flynn, M. Naccari, K. Sharman, C. Rosewarne, J. Sole, G. J. Sullivan, T. Suzuki, “HEVC Range Extension Draft 6”, JCTVC-P1005, Jan. 2014, San Jose.
-   [R18] J. Sole, S. Liu, “HEVC Screen Content Coding Core Experiment 1 (SCCE1): Intra Block Copying Extensions”, JCTVC-Q1121, Mar. 2014, Valencia.
-   [R19] C.-C. Chen, X. Xu, L. Zhang, “HEVC Screen Content Coding Core Experiment 2 (SCCE2): Line-based Intra Copy”, JCTVC-Q1122, Mar. 2014, Valencia.
-   [R20] Y.-W. Huang, P. Onno, R. Joshi, R. Cohen, X. Xiu, Z. Ma, “HEVC Screen Content Coding Core Experiment 3 (SCCE3): Palette mode”, JCTVC-Q1123, Mar. 2014, Valencia.
-   [R21] Y. Chen, J. Xu, “HEVC Screen Content Coding Core Experiment 4 (SCCE4): String matching for sample coding”, JCTVC-Q1124, Mar. 2014, Valencia.
-   [R22] X. Xiu, J. Chen, “HEVC Screen Content Coding Core Experiment 5 (SCCE5): Inter-component prediction and adaptive color transforms”, JCTVC-Q1125, Mar. 2014, Valencia.
-   [R23] P. Onno, G. Laroche, T. Poirier, C. Gisquet, “AhG5: On the displacement vector prediction scheme for Intra Block Copy”, JCTVC-Q0062, Mar. 2014, Valencia.
-   [R24] X. Zhang, K. Zhang, J. An, H. Huang, S. Lei, “Block vector prediction for intra block copy”, JCTVC-Q0080, Mar. 2014, Valencia.
-   [R25] K. Zhang, J. An, X. Zhang, H. Huang, S. Lei, “Symmetric intra block copy”, JCTVC-Q0082, Mar. 2014, Valencia.
-   [R26] S.-T. Hsiang, T.-D. Chuang, S. Lei, “AHG8: Coding the prediction differences of the intra BC vectors”, JCTVC-Q0095, Mar. 2014, Valencia.
-   [R27] C. Pang, J. Sole, R. Joshi, M. Karczewicz, “Block vector prediction method for intra block copy”, JCTVC-Q0114, Mar. 2014, Valencia.
-   [R28] L. Zhu, J. Xu, G. J. Sullivan, Y. Wu, S. Sankuratri, B. A. Kumar, “Ping-pong block vector predictor for intra block copy”, JCTVC-Q0134, Mar. 2014, Valencia.
-   [R29] B. Li, J. Xu, “Hash-based intraBC search”, JCTVC-Q0252, Mar. 2014, Valencia.
-   [R30] C. Pang, J. Sole, T. Hsieh, M. Karczewicz, “Intra block copy with larger search region”, JCTVC-Q0139, Mar. 2014, Valencia.
-   [R31] R. Joshi, J. Xu, R. Cohen, S. Liu, Z. Ma, Y. Ye, “Screen content coding test model 1 (SCM 1)”, JCTVC-Q1014, Mar. 2014, Valencia.

What is claimed is:
 1. A video decoding method comprising: decoding a plurality of blocks; storing information of at least some decoded blocks in a same row as a current block; and decoding the current block with intra block copy coding based at least in part on the stored information.
 2. The method of claim 1, wherein storing information of at least some decoded blocks in a same row as a current block comprises caching information of at least some decoded blocks in a same row as a current block.
 3. The method of claim 1, wherein decoding the current block with intra block copy coding comprises: decoding a block vector from a bitstream; and generating a prediction of the current block based on the block vector.
 4. The method of claim 3, wherein decoding the current block with intra block copy coding further comprises: decoding from the bitstream a residual of the current block; and adding the residual to the prediction of the current block to generate a reconstructed video block.
 5. The method of claim 3, wherein the block vector points to a reference block in the same row as the current block, and wherein the prediction of the current block is based on the reference block.
 6. The method of claim 1, wherein the decoded blocks in the same row as the current block are decoded blocks in a same row of coding tree units (CTUs) as the current block.
 7. A video encoding method comprising: selecting a block vector for encoding of a current block, wherein the block vector is selected to point to a reference block in a same row as the current block; and encoding the current block with intra block copy coding using the selected block vector.
 8. The method of claim 7, wherein the decoded blocks in the same row as the current block are decoded blocks in a same row of coding tree units (CTUs) as the current block.
 9. The method of claim 7, wherein encoding the current block comprises encoding the block vector in a bitstream.
 10. The method of claim 7, wherein encoding the current block comprises generating a prediction of the current block based on the block vector.
 11. The method of claim 10, wherein encoding the current block further comprises: subtracting the prediction of the current block from an input video block to obtain a residual; and encoding the residual in a bitstream.
 12. A video decoding apparatus comprising a processor configured to perform at least: decoding a plurality of blocks; storing information of at least some decoded blocks in a same row as a current block; and decoding the current block with intra block copy coding based at least in part on the stored information.
 13. The apparatus of claim 12, wherein storing information of at least some decoded blocks in a same row as a current block comprises caching information of at least some decoded blocks in a same row as a current block.
 14. The apparatus of claim 12, wherein decoding the current block with intra block copy coding comprises: decoding a block vector from a bitstream; and generating a prediction of the current block based on the block vector.
 15. The apparatus of claim 14, wherein decoding the current block with intra block copy coding further comprises: decoding from the bitstream a residual of the current block; and adding the residual to the prediction of the current block to generate a reconstructed video block.
 16. The apparatus of claim 14, wherein the block vector points to a reference block in the same row as the current block, and wherein the prediction of the current block is based on the reference block.
 17. The apparatus of claim 12, wherein the decoded blocks in the same row as the current block are decoded blocks in a same row of coding tree units (CTUs) as the current block.